<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" >
<channel>
<title>Word Aligned</title>
<link>https://wordaligned.org</link>
<description>tales from the code face</description>
<dc:creator>tag@wordaligned.org</dc:creator>
<language>en-gb</language>
<item>
<title>Priority queues in Python</title>
<description>&lt;p&gt;In the previous article I noted that Python&amp;#8217;s
&lt;a href=&quot;https://docs.python.org/3/library/heapq.html&quot;&gt;heapq&lt;/a&gt; module is the
only part of standard Python I could think of which deals with sorting,
but which doesn&amp;#8217;t give you full control over the sort order.&lt;/p&gt;
&lt;p&gt;That means you need to take care when using a heapq as a priority queue.&lt;/p&gt;
&lt;p&gt;For example, the &lt;a href=&quot;https://en.wikipedia.org/wiki/A*_search_algorithm&quot;&gt;A* search
algorithm&lt;/a&gt; is a
&lt;strong&gt;best first&lt;/strong&gt; path finder. It maintains a priority queue of possible
steps to take, ordered by an estimate of the total distance of the
path routed through these steps. At each stage it pops the next step
&amp;#8212; the one with the shortest estimated total distance &amp;#8212; from the queue,
then updates based on the new position.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;import heapq

def a_star(start, heuristic, moves):
    &quot;&quot;&quot;A* graph search

    - start: initial state
    - heuristic(s): estimates path length from state s to finish. 
                    heuristic(s) == 0 =&amp;gt; finished
    - moves(s): neighbours of s

    Returns the shortest path to the finish and its cost, on success.
    None otherwise.
    &quot;&quot;&quot;
    costs = {start: 0}
    prev = {start: None}
    frontier = [(heuristic(start), start)]
    heapq.heapify(frontier)

    def path(s):
        return [] if s is None else path(prev[s]) + [s]

    while frontier:
        priority, state = heapq.heappop(frontier)
        if heuristic(state) == 0:
            return costs[state], path(state)
        for n in moves(state):
            n_cost = costs[state] + 1
            if n not in costs or n_cost &amp;lt; costs[n]:
                costs[n] = n_cost
                heapq.heappush(frontier, (n_cost + heuristic(n), n))
                prev[n] = state

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The code above is a Python A* implementation. For simplicity, we&amp;#8217;ll
assume the cost of each step is 1. It&amp;#8217;s easy enough to adapt the function
if this isn&amp;#8217;t the case.&lt;/p&gt;
&lt;p&gt;The priority queue here is named &lt;code&gt;frontier&lt;/code&gt;, the collection of states
we need to explore. The sequence &lt;code&gt;heapify&lt;/code&gt;, &lt;code&gt;heappop&lt;/code&gt;, &lt;code&gt;heappush&lt;/code&gt;
maintains the priority ordering. (The call to &lt;code&gt;heapify&lt;/code&gt; isn&amp;#8217;t even
needed since a list with one element is already a heap.)  So, each
time we pop a state from the queue, we get the one with the lowest
estimated cost. Then, based on the moves we can make from this new
step, we update our internal records of costs and previous nodes.&lt;/p&gt;
&lt;p&gt;Note that the items on the queue are &lt;code&gt;(cost, state)&lt;/code&gt; pairs. The costs
will be numbers, typically positive integers &amp;#8212; the exact values depend
on the heuristic function.&lt;/p&gt;
&lt;p&gt;Exactly what gets used as &lt;code&gt;state&lt;/code&gt; is up to the caller
which supplies &lt;code&gt;start&lt;/code&gt;, the initial state, and &lt;code&gt;moves&lt;/code&gt;, which steps
from a state to its neighbours.&lt;/p&gt;
&lt;p&gt;However, if items on the queue are tied on &lt;code&gt;cost&lt;/code&gt;, the heapq may need to
compare &lt;code&gt;state&lt;/code&gt; values. If the states have no defined ordering
this results in a runtime error.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; class State: pass
... 
&amp;gt;&amp;gt;&amp;gt; x1, x2 = State(), State()
&amp;gt;&amp;gt;&amp;gt; (1, x1) &amp;lt; (2, x2)
True
&amp;gt;&amp;gt;&amp;gt; (1, x1) &amp;lt; (1, x2)
Traceback (most recent call last):
  File &quot;&amp;lt;stdin&amp;gt;&quot;, line 1, in &amp;lt;module&amp;gt;
TypeError: &#x27;&amp;lt;&#x27; not supported between instances of &#x27;State&#x27; and &#x27;State&#x27;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We cannot supply a sort key to the &lt;code&gt;heapq&lt;/code&gt; functions. Does this mean
clients must ensure state objects &amp;#8212; whatever they actually are &amp;#8212; can be
compared? As the code stands, yes, but the &lt;a href=&quot;https://docs.python.org/3/library/heapq.html#priority-queue-implementation-notes&quot;&gt;module documentation&lt;/a&gt; has advice on handling this situatation:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[&amp;#8230;] store entries as 3-element list including the priority, an
entry count, and the task. The entry count serves as a tie-breaker
so that two tasks with the same priority are returned in the order
they were added. And since no two entry counts are the same, the
tuple comparison will never attempt to directly compare two tasks.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This strategy gives a more usable &lt;code&gt;a_star&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;import heapq
import itertools

def a_star(start, heuristic, moves):
    costs = {start: 0}
    prev = {start: None}
    seq = itertools.count().__next__
    frontier = [(heuristic(start), seq(), start)]

    def path(s):
        return [] if s is None else path(prev[s]) + [s]

    while frontier:
        _, _, state = heapq.heappop(frontier)
        if heuristic(state) == 0:
            return costs[state], path(state)
        for n in moves(state):
            n_cost = costs[state] + 1
            if n not in costs or n_cost &amp;lt; costs[n]:
                costs[n] = n_cost
                heapq.heappush(frontier, (n_cost + heuristic(n), seq(), n))
                prev[n] = state

&lt;/pre&gt;

&lt;/div&gt;</description>
<dc:date>2022-02-25</dc:date>
<guid>https://wordaligned.org/articles/priority-queues-in-python</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/priority-queues-in-python</link>
<category>Python</category>
</item>

<item>
<title>Binary search gets a sort key</title>
<description>&lt;p&gt;Suppose you have an list of distinct elements which has been sorted
and rotated. How would you look up an element within that list?&lt;/p&gt;
&lt;p&gt;For example, the list:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[7, 11, 13, 19, 2, 3, 5]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;is sorted (the first 7 primes, in order) and rotated (to put 7 first).&lt;/p&gt;
&lt;p&gt;With this list as input, then:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;look up &lt;code&gt;13&lt;/code&gt; returns &lt;code&gt;2&lt;/code&gt; since &lt;code&gt;13&lt;/code&gt; is at index &lt;code&gt;2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;look up &lt;code&gt;2&lt;/code&gt; returns &lt;code&gt;4&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;look up &lt;code&gt;4&lt;/code&gt; returns the sentinel value &lt;code&gt;-1&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The obvious technique is to just search the list:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def lookup(values, v):
    try:
        return values.index(v)
    except IndexError:
        return -1
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is a linear algorithm which processes the entire list. Is there a way
to take advantage of its sorted+rotated-ness?&lt;/p&gt;
&lt;p&gt;If the list &lt;strong&gt;was&lt;/strong&gt; sorted then we could apply a binary search for a
logarithmic lookup. And in fact, by applying a custom ordering, the
list &lt;strong&gt;is&lt;/strong&gt; sorted.&lt;/p&gt;
&lt;p&gt;How can we apply a custom ordering in Python?&lt;/p&gt;
&lt;p&gt;The way to do this has changed as Python has developed. The table below shows
the evolution of standard Python functions which sort and compare.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Functions&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2.0&lt;/td&gt;
&lt;td&gt;2000&lt;/td&gt;
&lt;td&gt;max, min, list.sort([cmp]), bisect.bisect&lt;/td&gt;
&lt;td&gt;Note that [cmp] slows the sorting process down considerably&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.3&lt;/td&gt;
&lt;td&gt;2003&lt;/td&gt;
&lt;td&gt;heapq&lt;/td&gt;
&lt;td&gt;Also known as the priority queue algorithm&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.4&lt;/td&gt;
&lt;td&gt;2004&lt;/td&gt;
&lt;td&gt;sorted(iterable[, cmp[, key[, reverse]]]), list.sort([cmp[, key[, reverse]]]), itertools.groupby(iterable[, key)&lt;/td&gt;
&lt;td&gt;In general, the key and reverse conversion processes are much faster than specifying an equivalent cmp function&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.4&lt;/td&gt;
&lt;td&gt;2004&lt;/td&gt;
&lt;td&gt;heapq.nlargest, heapq.nsmallest&lt;/td&gt;
&lt;td&gt;Heapq extended&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.5&lt;/td&gt;
&lt;td&gt;2006&lt;/td&gt;
&lt;td&gt;max([key]), min([key]), heapq.nlargest(&amp;#8230;[, key]), heapq.smallest(&amp;#8230;[, key])&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.6&lt;/td&gt;
&lt;td&gt;2008&lt;/td&gt;
&lt;td&gt;heapq.merge&lt;/td&gt;
&lt;td&gt;Heapq extended again&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3.0&lt;/td&gt;
&lt;td&gt;2008&lt;/td&gt;
&lt;td&gt;sorted(iterable, key[, reverse]), list.sort(key[, reverse])&lt;/td&gt;
&lt;td&gt;No more &lt;code&gt;cmp&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3.5&lt;/td&gt;
&lt;td&gt;2015&lt;/td&gt;
&lt;td&gt;heapq.merge(&amp;#8230;, key)&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3.10&lt;/td&gt;
&lt;td&gt;2021&lt;/td&gt;
&lt;td&gt;bisect.bisect(&amp;#8230;, key) etc&lt;/td&gt;
&lt;td&gt;That leaves the low-level &lt;code&gt;heapq&lt;/code&gt; functions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The earliest versions of Python allowed you to sort lists, and
the sort was customised using a &lt;code&gt;cmp&lt;/code&gt; function &amp;#8212; though the documentation
warned there would be a performance penalty[*]. The builtin &lt;code&gt;min&lt;/code&gt; and &lt;code&gt;max&lt;/code&gt;
functions could not be cutomised, and nor could the comparison used in
the &lt;code&gt;bisect&lt;/code&gt; module &amp;#8212; which is Python&amp;#8217;s binary search implementation.&lt;/p&gt;
&lt;p&gt;At 2.3 the &lt;code&gt;heapq&lt;/code&gt; module appeared, but, like &lt;code&gt;bisect&lt;/code&gt;, there was no way
to customise the ordering of heap elements.&lt;/p&gt;
&lt;p&gt;2.4 introduced the &lt;code&gt;key&lt;/code&gt; argument to customise ordering,
noting this should be preferred to &lt;code&gt;cmp&lt;/code&gt;. Unlike &lt;code&gt;cmp&lt;/code&gt; which compares two
elements, &lt;code&gt;key&lt;/code&gt; takes a single element and returns a sort rank for that element.
The &lt;code&gt;key&lt;/code&gt; argument was added alongside &lt;code&gt;cmp&lt;/code&gt; in &lt;code&gt;list.sort&lt;/code&gt; and the new &lt;code&gt;sorted&lt;/code&gt;
builtin function; and &lt;code&gt;key&lt;/code&gt; was the only way to customise the grouping in
&lt;code&gt;itertools.groupby&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;2.5 added &lt;code&gt;key&lt;/code&gt; to &lt;code&gt;min&lt;/code&gt; and &lt;code&gt;max&lt;/code&gt;, and also to &lt;code&gt;heapq.nlargest&lt;/code&gt;, &lt;code&gt;heapq.nsmallest&lt;/code&gt;.
Although these &lt;code&gt;heapq&lt;/code&gt; functions now accept a &lt;code&gt;key&lt;/code&gt;, the lower level heap functions
to heapify, push, pop and replace do not.&lt;/p&gt;
&lt;p&gt;2.6 introduced &lt;code&gt;heapq.merge&lt;/code&gt;, a handy function to merge sorted inputs
using a heap, but with no option to specify a sort key.&lt;/p&gt;
&lt;p&gt;3.0 got rid of &lt;code&gt;cmp&lt;/code&gt;, making &lt;code&gt;key&lt;/code&gt; the only way to customise
sorting. To migrate from Python 2 to 3 any &lt;code&gt;cmp&lt;/code&gt; functions need
converting to &lt;code&gt;key&lt;/code&gt; functions. As with Python 2.5, at 3.0 you could
apply a key to &lt;code&gt;list.sort&lt;/code&gt;, &lt;code&gt;sorted&lt;/code&gt;, &lt;code&gt;min&lt;/code&gt;, &lt;code&gt;max&lt;/code&gt;,
&lt;code&gt;itertools.groupby&lt;/code&gt;, &lt;code&gt;heapq.nlargest&lt;/code&gt;, &lt;code&gt;heapq.nsmallest&lt;/code&gt; &amp;#8212; but not to
&lt;code&gt;bisect&lt;/code&gt; or other &lt;code&gt;heapq&lt;/code&gt; functions.&lt;/p&gt;
&lt;p&gt;3.5 added &lt;code&gt;key&lt;/code&gt; to &lt;code&gt;heapq.merge&lt;/code&gt;, aligning it with &lt;code&gt;heapq.nlargest&lt;/code&gt;
and &lt;code&gt;heapq.nsmallest&lt;/code&gt;, though it remained impossible to use a sort key
with the lower level heap functions.&lt;/p&gt;
&lt;p&gt;The next change came in 3.10, when the &lt;code&gt;key&lt;/code&gt; parameter was
added to the &lt;code&gt;bisect&lt;/code&gt; module. As far as I can tell that means the only part of
standard Python which doesn&amp;#8217;t let you fully customise ordering is the &lt;code&gt;heapq&lt;/code&gt; module.&lt;/p&gt;
&lt;h2 id=&quot;bisect-with-a-search-key&quot;&gt;Bisect with a search key&lt;/h2&gt;
&lt;p&gt;So, to return to the original puzzle, consider our example
sorted+rotated list:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; values = [7, 11, 13, 19, 2, 3, 5]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first four elements are greater than or equal to the first
element, 7. The last three elements are less than 7.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; [v &amp;lt; values[0] for v in values]
[False, False, False, False, True, True, True]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since &lt;code&gt;False&lt;/code&gt; &amp;lt; &lt;code&gt;True&lt;/code&gt; the result of the list comprehension above is sorted.
Extending this idea, the following comprehension is also sorted.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; [(v &amp;lt; values[0], v) for v in values]
[(False, 7), (False, 11), (False, 13), (False, 19), (True, 2), (True, 3), (True, 5)]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now we have a logarithmic lookup using binary search:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from bisect import bisect_left

def lookup(values, v):
    key = lambda x: (x &amp;lt; values[0], x)
    i = bisect_left(values, key(v), key=key)
    return i if (i &amp;lt; len(values) and values[i] == v) else -1
&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&quot;dont-search-for-v-search-for-its-key&quot;&gt;Don&amp;#8217;t search for v, search for its key&lt;/h2&gt;
&lt;p&gt;I was surprised to realise the value &lt;code&gt;v&lt;/code&gt; being looked up has to have the &lt;code&gt;key&lt;/code&gt;
function applied &lt;strong&gt;before&lt;/strong&gt; calling &lt;code&gt;bisect_left&lt;/code&gt;. That is, to find where &lt;code&gt;v&lt;/code&gt;
should go in &lt;code&gt;values&lt;/code&gt; to maintain the sort order, we pass &lt;code&gt;key(v)&lt;/code&gt; to &lt;code&gt;bisect_left&lt;/code&gt;.
This doesn&amp;#8217;t match the interface to binary search in other languages. It also means
in our example we have to handle empty lists as a special case.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def lookup(values, v):
    if not values: return -1
    key = lambda x: (x &amp;lt; values[0], x)
    i = bisect_left(values, key(v), key=key)
    return i if (i &amp;lt; len(values) and values[i] == v) else -1
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;[*] Before the introduction of the sort &lt;code&gt;key&lt;/code&gt;, the standard pattern was
&lt;a href=&quot;https://docs.python.org/3/howto/sorting.html#the-old-way-using-decorate-sort-undecorate&quot;&gt;decorate-sort-undecorate&lt;/a&gt;.&lt;/p&gt;</description>
<dc:date>2022-02-15</dc:date>
<guid>https://wordaligned.org/articles/binary-search-gets-a-sort-key</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/binary-search-gets-a-sort-key</link>
<category>Python</category>
</item>

<item>
<title>Python maths updates</title>
<description>&lt;p&gt;A quick note on some useful updates made to standard Python maths support. &lt;a href=&quot;https://docs.python.org/3/library/math.html#math.prod&quot;&gt;Math.prod&lt;/a&gt; does for &lt;code&gt;*&lt;/code&gt; what &lt;a href=&quot;https://docs.python.org/3/library/functions.html#sum&quot;&gt;sum&lt;/a&gt; does for &lt;code&gt;+&lt;/code&gt;. It was added at Python 3.8.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; math.prod([3, 4, 5])
60
&amp;gt;&amp;gt;&amp;gt; math.prod([])
1

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Added in 3.9, &lt;a href=&quot;https://docs.python.org/3/library/math.html#math.lcm&quot;&gt;math.lcm&lt;/a&gt;, returns the least common multiple of its integer arguments. As with &lt;a href=&quot;https://docs.python.org/3/library/math.html#math.prod&quot;&gt;math.prod&lt;/a&gt;, an empty list of arguments returns 1. Extended in 3.9, the related function &lt;a href=&quot;https://docs.python.org/3/library/math.html#math.gcd&quot;&gt;math.gcd&lt;/a&gt; now accepts an arbitrary list of arguments.&lt;/p&gt;
&lt;p&gt;For Euclidean geometry, &lt;a href=&quot;https://docs.python.org/3/library/math.html#math.hypot&quot;&gt;math.hypot&lt;/a&gt; now supports n-dimensional points, and &lt;a href=&quot;https://docs.python.org/3/library/math.html#math.dist&quot;&gt;math.dist&lt;/a&gt; has been added, again working on any pair of n-dimensional points.&lt;/p&gt;
&lt;p&gt;These useful additions and extensions are all in the standard math module (along with several others not mentioned here). Previously you&amp;#8217;d have to roll your own or rely on e.g. &lt;a href=&quot;https://numpy.org/&quot;&gt;numpy&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I also wanted to mention a nice extension to the power function; in this case to the builtin &lt;a href=&quot;https://docs.python.org/3/library/functions.html#pow&quot;&gt;pow&lt;/a&gt; &amp;#8212; &lt;a href=&quot;https://docs.python.org/3/library/math.html#math.pow&quot;&gt;math.pow&lt;/a&gt; remains as it was. Builtin &lt;a href=&quot;https://docs.python.org/3/library/functions.html#pow&quot;&gt;pow&lt;/a&gt; takes an optional third parameter &lt;code&gt;pow(base, exp[, mod])&lt;/code&gt;. If &lt;code&gt;mod&lt;/code&gt; is specified it must be a non-zero integer, and both &lt;code&gt;base&lt;/code&gt; and &lt;code&gt;exp&lt;/code&gt; must also be integers. The significant but subtle change made at 3.8 is that &lt;code&gt;exp&lt;/code&gt; can now be a negative integer, enabling modular inverse calculations. &lt;/p&gt;
&lt;p&gt;Quoting the &lt;a href=&quot;https://docs.python.org/3/library/functions.html#pow&quot;&gt;documentation&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;&amp;#8230; If &lt;i&gt;mod&lt;/i&gt; is present and &lt;i&gt;exp&lt;/i&gt; is negative, &lt;i&gt;base&lt;/i&gt; must be relatively prime to &lt;i&gt;mod&lt;/i&gt;. In that case, &lt;i&gt;pow(inv_base, -exp, mod)&lt;/i&gt; is returned, where inv_base is an inverse to base modulo mod.&lt;/p&gt;&lt;p&gt;Here&amp;#8217;s an example of computing an inverse for 38 modulo 97:&lt;/p&gt;&lt;code&gt;&lt;pre&gt;&gt;&gt;&gt; pow(38, -1, mod=97)
23
&gt;&gt;&gt; 23 * 38 % 97 == 1
True
&lt;/pre&gt;&lt;/code&gt;&lt;/blockquote&gt;</description>
<dc:date>2021-01-03</dc:date>
<guid>https://wordaligned.org/articles/python-maths-updates</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/python-maths-updates</link>
<category>Python</category>
</item>

<item>
<title>Complex numbers for planar geometry</title>
<description>&lt;p&gt;Once again, I&amp;#8217;m enjoying solving &lt;a href=&quot;http://was.tl&quot;&gt;Eric Wastl&lt;/a&gt;&amp;#8217;s excellent &lt;a href=&quot;https://adventofcode.com&quot;&gt;Advent of Code&lt;/a&gt; puzzles.&lt;/p&gt;
&lt;p&gt;Today, &lt;a href=&quot;day12&quot;&gt;day 12&lt;/a&gt; involved a ship navigating in a 2D plane. The ship follows a series of instructions taking the form of actions and values such as &lt;code&gt;F10 N3 F7 R90 F11 ...&lt;/code&gt;, where, for the first part:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;actions N, S, E, W step by the value in the compass directions N, S, E, W&lt;/li&gt;
&lt;li&gt;actions L, R turn the ship left and right by the number of degrees specified by the value&lt;/li&gt;
&lt;li&gt;action F advances the ship in the direction it faces by the given value&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Perhaps the most obvious way to model this is by implementing a &lt;code&gt;Point&lt;/code&gt; class, with &lt;code&gt;x&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; values, and suitable methods to advance and rotate.&lt;/p&gt;
&lt;p&gt;Another way is to use complex numbers, which are natively supported by Python &amp;#8212; and many other languages. The complex plane &lt;em&gt;is&lt;/em&gt; a 2D plane. The point &lt;code&gt;(x, y)&lt;/code&gt; is represented by a single complex number &lt;code&gt;p = x + y * j&lt;/code&gt;. Stepping in any direction corresponds to addition, and rotation to multiplication.&lt;/p&gt;
&lt;p&gt;So, the instructions &lt;code&gt;N, S, E, W&lt;/code&gt; map to adding complex numbers &lt;code&gt;j, -j, 1, -1&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Quarter turns left and right map to multiplication by &lt;code&gt;j&lt;/code&gt; and &lt;code&gt;-j&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def ship_position(instructions):
    &amp;#x27;&amp;#x27;&amp;#x27;Return the final position of the ship after following the instructions

    Instructions are a series of (action, value) pairs.
    &amp;#x27;&amp;#x27;&amp;#x27;
    ship, direction = 0, 1 # Start at the origin facing E
    step = {&amp;#x27;N&amp;#x27;:1j, &amp;#x27;W&amp;#x27;:-1, &amp;#x27;S&amp;#x27;:-1j, &amp;#x27;E&amp;#x27;: 1}
    turn = {&amp;#x27;R&amp;#x27;:-1j, &amp;#x27;L&amp;#x27;:1j}
    for action, value in instructions:
        if action in &amp;#x27;NSEW&amp;#x27;:
            ship += value * step[action]
        elif action in &amp;#x27;LR&amp;#x27;:
            assert value % 90 == 0 # check rectilinear motion
            direction *= turn[action] ** (value//90)
        else:
            assert action == &amp;#x27;F&amp;#x27;
            ship += value * direction
    return ship

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It&amp;#8217;s also possible to implement the conditionals above arithmetically. We can think of each instruction advancing the ship (possibly by zero) and turning it (possibly by zero degrees).&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def ship_position(instructions):
    ship, direction = 0, 1
    step = {&amp;#x27;N&amp;#x27;:1j, &amp;#x27;W&amp;#x27;:-1, &amp;#x27;S&amp;#x27;:-1j, &amp;#x27;E&amp;#x27;: 1}
    turn = {&amp;#x27;R&amp;#x27;:-1j, &amp;#x27;L&amp;#x27;:1j}
    for action, value in instructions:
        ship += value * (step.get(action, 0) + (action==&amp;#x27;F&amp;#x27;) * direction)
        direction *= turn.get(action, 1) ** (value//90)
    return ship

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;More solutions &lt;a href=&quot;https://github.com/wordaligned/advent-of-code-2020&quot;&gt;here&lt;/a&gt;, and thanks again to Eric Wastl.&lt;/p&gt;</description>
<dc:date>2020-12-12</dc:date>
<guid>https://wordaligned.org/articles/complex-numbers-for-planar-geometry</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/complex-numbers-for-planar-geometry</link>
<category>Python</category>
</item>

<item>
<title>Aligning the first line of a triple-quoted string in Python</title>
<description>&lt;p&gt;Python&amp;#8217;s triple-quoted strings are a convenient syntax for strings where the contents span multiple lines. Unescaped newlines are allowed in triple-quoted strings. So, rather than write:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;song = (&quot;Happy birthday to you\n&quot;
        &quot;Happy birthday to you\n&quot;
        &quot;Happy birthday dear Gail\n&quot;
        &quot;Happy birthday to you\n&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;you can write:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;song = &quot;&quot;&quot;Happy birthday to you
Happy birthday to you
Happy birthday dear Gail
Happy birthday to you
&quot;&quot;&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The only downside here is that the first line doesn&amp;#8217;t align nicely with the lines which follow. The way around this is to embed a &lt;code&gt;\newline&lt;/code&gt; escape sequence, meaning both &lt;a href=&quot;https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals&quot; title=&quot;String and Bytes literals&quot;&gt;backslash and newline are ignored&lt;/a&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;song = &quot;&quot;&quot;\
Happy birthday to you
Happy birthday to you
Happy birthday dear Gail
Happy birthday to you
&quot;&quot;&quot;
&lt;/code&gt;&lt;/pre&gt;</description>
<dc:date>2019-02-17</dc:date>
<guid>https://wordaligned.org/articles/aligning-the-first-line-of-a-triplequoted-string-in-python</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/aligning-the-first-line-of-a-triplequoted-string-in-python</link>
<category>Python</category>
</item>

<item>
<title>Python Counters @PyDiff</title>
<description>&lt;p&gt;On Monday I gave a talk at &lt;a href=&quot;http://www.pydiff.wales/&quot;&gt;PyDiff&lt;/a&gt; on the subject of Python Counters. A Counter is a specialised &lt;code&gt;dict&lt;/code&gt; which has much in common with a &lt;code&gt;set&lt;/code&gt;. Of course all dicts are like sets, but with Counters the resemblance is even stronger. The &lt;a href=&quot;https://docs.python.org/3/library/collections.html#collections.Counter&quot;&gt;documentation&lt;/a&gt; states:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Counter class is similar to bags or multisets in other languages.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Alongside set operations of union, intersection and is-a-subset, Counter also supports addition and subtraction &amp;#8212; natural and unsurprising operations for a container whose job is to keep count of its elements. If you want to unite the contents of two Counters, it&amp;#8217;s probably &lt;code&gt;+&lt;/code&gt; you want rather than &lt;code&gt;|&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Counters came to mind as a lightning talk subject since I had a go at the &lt;a href=&quot;https://adventofcode.com/2018&quot; title=&quot;25 Christmas themed puzzles made available one-per-day as a countdown to Christmas&quot;&gt;Advent of Code&lt;/a&gt; last year and used no fewer than 12 Counters in my &lt;a href=&quot;https://github.com/wordaligned/advent-of-code-2018/blob/master/solutions.ipynb&quot;&gt;solutions&lt;/a&gt; to 25 puzzles &amp;#8212; and that total could well increase since I haven&amp;#8217;t finished yet.&lt;/p&gt;
&lt;p&gt;The talk itself is on &lt;a href=&quot;https://github.com/wordaligned/python-counters&quot;&gt;github&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/wordaligned/python-counters&quot;&gt;&lt;img alt=&quot;Coins&quot; src=&quot;https://wordaligned.org/images/coins.jpg&quot;&gt;&lt;/a&gt;&lt;/p&gt;</description>
<dc:date>2019-01-19</dc:date>
<guid>https://wordaligned.org/articles/python-counters-pydiff</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/python-counters-pydiff</link>
<category>Python</category>
</item>

<item>
<title>Creating a dict of lists in Python</title>
<description>&lt;p&gt;Suppose you have a list of objects which you want to convert into a dict mapping from some object key to the (sub-)list of objects with that key. To provide a simple example, let&amp;#8217;s start with a list of fruits.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;from collections import namedtuple

Fruit = namedtuple(&amp;#x27;Fruit&amp;#x27;, &amp;#x27;name colour&amp;#x27;)

def banana():     return Fruit(&amp;#x27;banana&amp;#x27;, &amp;#x27;yellow&amp;#x27;)
def grape():      return Fruit(&amp;#x27;grape&amp;#x27;, &amp;#x27;green&amp;#x27;)
def pear():       return Fruit(&amp;#x27;pear&amp;#x27;, &amp;#x27;green&amp;#x27;)
def strawberry(): return Fruit(&amp;#x27;strawberry&amp;#x27;, &amp;#x27;red&amp;#x27;)
def cherry():     return Fruit(&amp;#x27;cherry&amp;#x27;, &amp;#x27;red&amp;#x27;)

fruits = [
    banana(), pear(), cherry(), cherry(), pear(),
    grape(), banana(), grape(), cherry(), grape(),
    strawberry(), pear(), grape(), cherry()]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We&amp;#8217;d like to arrange a fruitbowl &amp;#8212; a dict which groups fruits by colour. This can be done by creating an empty bowl, then iterating through the fruits placing each in the correct list.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;fruitbowl = {}

for fruit in fruits:
    fruitbowl.setdefault(fruit.colour, []).append(fruit)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;Dict.setdefault&lt;/code&gt; is a bit of an oddity in Python, both doing something and returning a value, but it&amp;#8217;s a convenient shorthand in this case. Despite this convenience it&amp;#8217;s more common to use a &lt;code&gt;defaultdict&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;from collections import defaultdict

fruitbowl = defaultdict(list)

for fruit in fruits:
    fruitbowl[fruit.colour].append(fruit)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here&amp;#8217;s a function to display the fruitbowl.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def print_bowl(bowl):
    print(&amp;#x27;\n&amp;#x27;.join(
        &amp;#x27;{}: {}&amp;#x27;.format(colour,
                        &amp;#x27;, &amp;#x27;.join(f.name for f in fruits))
        for colour, fruits in bowl.items()))

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If we call this function, we see the fruits have indeed been grouped by colour.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; print_bowl(fruitbowl)
yellow: banana, banana
green: pear, pear, grape, grape, grape, pear, grape
red: cherry, cherry, cherry, strawberry, cherry

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This is all fine and idiomatic Python, but whenever I see an empty dict being created followed by a loop to populate it, I wonder if a comprehension could be used.&lt;/p&gt;
&lt;p&gt;Is there a way to declare and initialise the dict in a single expression? Here&amp;#8217;s the best I came up with.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;from operator import attrgetter
from itertools import groupby

colour = attrgetter(&amp;#x27;colour&amp;#x27;)

fruitbowl = {
    col: list(fts)
    for col, fts in groupby(sorted(fruits, key=colour), colour)}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Is this better than the &lt;code&gt;defaultdict&lt;/code&gt; solution. Probably not, but it&amp;#8217;s a technique worth remembering. Maybe the &lt;code&gt;fruitbowl&lt;/code&gt; isn&amp;#8217;t needed, and we actually just need to iterate through the fruits grouped by colour. For example, which colour is most popular?&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; max(fruitbowl.items(), key=lambda kv: len(kv[1]))[0]
&amp;#x27;green&amp;#x27;

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Using &lt;code&gt;groupby&lt;/code&gt;, we don&amp;#8217;t need the bowl.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; def grouplen(k_gp):
...     return sum(1 for _ in k_gp[1])
&amp;gt;&amp;gt;&amp;gt; max(groupby(sorted(fruits, key=colour), colour), key=grouplen)[0]
&amp;gt;&amp;gt;&amp;gt; &amp;#x27;green&amp;#x27;

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In this case, we don&amp;#8217;t need &lt;code&gt;groupby&lt;/code&gt; either. &lt;a href=&quot;./timtowtdi-vs-tsboapooowtdi&quot;&gt;There is more than one way to do it&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; from collections import Counter
&amp;gt;&amp;gt;&amp;gt; Counter(map(colour, fruits)).most_common(1)
[(&amp;#x27;green&amp;#x27;, 7)]

&lt;/pre&gt;

&lt;/div&gt;</description>
<dc:date>2018-04-29</dc:date>
<guid>https://wordaligned.org/articles/dict-of-lists</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/dict-of-lists</link>
<category>Python</category>
</item>

<item>
<title>TIMTOWTDI vs TSBO-APOO-OWTDI</title>
<description>&lt;h2 id=&quot;timtowtdi&quot;&gt;TIMTOWTDI&lt;/h2&gt;
&lt;p&gt;TIMTOWTDI stands for &amp;#8220;There is more than one way to do it&amp;#8221;, an approach promoted by the Perl community.&lt;/p&gt;
&lt;p&gt;The mindset behind it gets explored in more detail by the language&amp;#8217;s creator, Larry Wall, in a talk given in 1999: &lt;a href=&quot;https://www.perl.com/pub/1999/03/pm.html/&quot;&gt;&amp;#8220;Perl, the first postmodern computer language&amp;#8221;&lt;/a&gt;. He attributes the slogan to his daughter, Heidi, who says it&amp;#8217;s a strategy which works well in her maths class; and &lt;strong&gt;she&lt;/strong&gt; associates it with another saying used at school: &amp;#8220;Tsall Good&amp;#8221;. This doesn&amp;#8217;t mean everything is good, or even everything has good bits. It means, overall, things are good. See the big picture.&lt;/p&gt;
&lt;p&gt;Perl epitomises this. It&amp;#8217;s eclectic and inclusive, supporting a variety of styles. One-liner? Fine! Like a shell script? Sure! Structured programming, object-oriented, functional? Why not! Tsall good.&lt;/p&gt;
&lt;p&gt;I like that.&lt;/p&gt;
&lt;p&gt;But do I feel that way about programming?&lt;/p&gt;
&lt;h2 id=&quot;tsbo-apoo-owtdi&quot;&gt;TSBO-APOO-OWTDI&lt;/h2&gt;
&lt;p&gt;A contrasting mantra appears in the &lt;a href=&quot;https://www.python.org/dev/peps/pep-0020/&quot;&gt;Zen of Python&lt;/a&gt;, a list of aphorisms which summarise the guiding principles behind Python&amp;#8217;s design. Item number 13 states &amp;#8220;There should be one &amp;#8212; and preferably only one &amp;#8212; obvious way to do it.&amp;#8221;&lt;/p&gt;
&lt;p&gt;Perhaps realising this sounds overly prescriptive, this rule is tempered by item 14, &amp;#8220;Although that way may not be obvious at first unless you&amp;#8217;re Dutch.&amp;#8221;&lt;/p&gt;
&lt;p&gt;Guido van Rossum, Python&amp;#8217;s BDFL &amp;#8212; Benevolent Dictator For Life &amp;#8212; would be the Dutch person who finds things obvious. That&amp;#8217;s right: &lt;strong&gt;Dictator&lt;/strong&gt;. Programmers don&amp;#8217;t like being told what to do any more than two year olds. How then has Python become so popular? &lt;/p&gt;
&lt;p&gt;Maybe emphasis falls on &lt;strong&gt;should&lt;/strong&gt;. There &lt;strong&gt;should&lt;/strong&gt; be only one obvious way to do it: it&amp;#8217;s just that &amp;#8212; Dutch or otherwise &amp;#8212; we haven&amp;#8217;t got there yet.&lt;/p&gt;
&lt;h2 id=&quot;timtop&quot;&gt;TIMTOP&lt;/h2&gt;
&lt;p&gt;For example, there is more than one Python. Obviously there&amp;#8217;s Python 2 and Python 3, but it&amp;#8217;s less obvious which to use. Don&amp;#8217;t forget &lt;a href=&quot;https://pypy.org/&quot;&gt;PyPy&lt;/a&gt;. Increasingly Python comes packaged with data processing and visualisation extensions, served up as a &lt;a href=&quot;http://jupyter.org/&quot;&gt;Jupyter&lt;/a&gt; notebook.&lt;/p&gt;
&lt;h2 id=&quot;timtopom&quot;&gt;TIMTOPOM&lt;/h2&gt;
&lt;p&gt;There is more than one program options module.&lt;/p&gt;
&lt;p&gt;When I started with Python there was &lt;a href=&quot;https://docs.python.org/3/library/getopt.html&quot;&gt;getopt&lt;/a&gt;, the one and only command line handler. Coming from a C/C++ background I was quite happy to use something resembling GNU&amp;#8217;s getopt. Then &lt;a href=&quot;https://docs.python.org/3/library/optparse.html&quot;&gt;optparse&lt;/a&gt; appeared. Now there&amp;#8217;s &lt;a href=&quot;https://docs.python.org/3/library/argparse.html&quot;&gt;argparse&lt;/a&gt;. All of these libraries are readily available. Which should I use? Not optparse, that&amp;#8217;s deprecated, unless I&amp;#8217;m already using it and it works, that is. Regarding the other contenders, the &lt;a href=&quot;https://docs.python.org/3/library/getopt.html&quot;&gt;documentation&lt;/a&gt; archly notes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Users who are unfamiliar with the C &lt;a href=&quot;http://man7.org/linux/man-pages/man3/getopt.3.html&quot;&gt;&lt;code&gt;getopt()&lt;/code&gt;&lt;/a&gt; function or who would like to write less code and get better help and error messages should consider using the &lt;a href=&quot;https://docs.python.org/3/library/argparse.html#module-argparse&quot;&gt;argparse module&lt;/a&gt; instead.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There are other non-standard Python options for parsing a command line too: ones which &lt;a href=&quot;http://docopt.org/&quot;&gt;generate code from the usage&lt;/a&gt; notes, or by &lt;a href=&quot;https://github.com/google/python-fire&quot;&gt;inspecting the code you want to expose&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There is more than one way to do it.&lt;/p&gt;
&lt;h2 id=&quot;timtoutf&quot;&gt;TIMTOUTF&lt;/h2&gt;
&lt;p&gt;There is more than one unit test framework. The obvious one, &lt;a href=&quot;https://docs.python.org/3/library/unittest.html&quot;&gt;unittest&lt;/a&gt;, like getopt, draws inspiration from elsewhere &amp;#8212; in this case Java&amp;#8217;s Junit. Unfortunately the port is too faithful, and you&amp;#8217;ll have to inherit from super classes etc to test something. I much prefer &lt;a href=&quot;https://docs.pytest.org&quot;&gt;PyTest&lt;/a&gt;, which flexes the language itself to deliver test assertions as &lt;code&gt;assert&lt;/code&gt;s.&lt;/p&gt;
&lt;p&gt;There&amp;#8217;s also a &lt;a href=&quot;https://docs.python.org/3/library/doctest.html&quot;&gt;doctest&lt;/a&gt; module in the standard library which executes and checks code found in strings (hold that thought!), and there are many other non-standard testing frameworks.&lt;/p&gt;
&lt;p&gt;There is more than one way to do it.&lt;/p&gt;
&lt;h2 id=&quot;timtowofs&quot;&gt;TIMTOWOFS&lt;/h2&gt;
&lt;p&gt;There is more than one way of formatting strings.&lt;/p&gt;
&lt;p&gt;As we&amp;#8217;ve seen there&amp;#8217;s more than one Python, and libraries are always up for reinvention. This is arguably evolution rather than a multiplicity of options. That is, the most recent way to do it should be preferred.&lt;/p&gt;
&lt;p&gt;When it comes to string formatting, though, there has &lt;strong&gt;always&lt;/strong&gt; been more than one way to do it, and more ways are still being added.&lt;/p&gt;
&lt;p&gt;Do you use &lt;code&gt;&#x27;single&#x27;&lt;/code&gt; or &lt;code&gt;&quot;double&quot;&lt;/code&gt; quotes for a string? &lt;code&gt;&quot;&quot;&quot;Triple&quot;&quot;&quot;&lt;/code&gt; quotes. Raw strings? Raw with an &lt;code&gt;r&lt;/code&gt; or Raw with an &lt;code&gt;R&lt;/code&gt;? TIMTOWTDI.&lt;/p&gt;
&lt;p&gt;What if you want to embed the value of a variable in a string? Users familiar with C&amp;#8217;s &lt;code&gt;printf()&lt;/code&gt; function might prefer &lt;code&gt;%&lt;/code&gt; formatting. Fans of $shell $parameter $expansion can use &lt;a href=&quot;https://docs.python.org/3/library/string.html#template-strings&quot;&gt;template strings&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://www.python.org/dev/peps/pep-3101/&quot;&gt;Advanced string formatting&lt;/a&gt; &amp;#8212; &lt;code&gt;str.format&lt;/code&gt; &amp;#8212; appeared in Python 3.0, backported to Python 2.6. No doubt it has advantages over &lt;code&gt;%&lt;/code&gt; formatting, but for me it&amp;#8217;s a little more obscure and a little less obvious. Python 3.6 introduces &lt;a href=&quot;https://www.python.org/dev/peps/pep-0498/&quot;&gt;f-strings&lt;/a&gt; which build on &lt;code&gt;str.format&lt;/code&gt; and knock down my reservations. The syntax allows you to evaluate expressions in strings: evidently Python is heading in Perl&amp;#8217;s direction.&lt;/p&gt;
&lt;h2 id=&quot;apsdotadiw&quot;&gt;APSDOTADIW&lt;/h2&gt;
&lt;p&gt;Let&amp;#8217;s finish by returning to Perl, and to &lt;a href=&quot;https://www.perl.com/pub/1999/03/pm.html/&quot;&gt;Larry Wall&amp;#8217;s 1999 talk&lt;/a&gt;. &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;How many times have we heard the mantra that a program should do one thing and do it well?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Perl is not that program. Perl wants to do everything well. It integrates features and makes no attempt to homogenise them.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You&amp;#8217;ve all heard the saying: If all you have is a hammer, everything starts to look like a nail.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Perl is no hammer: it has memorably been described as a Swiss army chainsaw, but Larry Wall likens it to a more conventional tool.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If all you have is duct tape, everything starts to look like a duct. Right. When&amp;#8217;s the last time you used duct tape on a duct?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Python may aspire to offer a single obvious way to do something. It fails splendidly, being more duct tape than hammer.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;I presented this blog post as a &lt;a href=&quot;https://www.meetup.com/PyDiff/events/249220768&quot;&gt;lightning talk at PyDiff&lt;/a&gt; a couple of days ago. The &lt;a href=&quot;https://wordaligned.org/docs/timtowtdi&quot;&gt;slides are here&lt;/a&gt;. The talk was recorded too: &lt;a href=&quot;https://cardiff.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=324c4015-51ed-4bd7-a1c7-a8c50089e746&quot;&gt;I appear about 24 minutes in&lt;/a&gt;.&lt;/p&gt;</description>
<dc:date>2018-04-19</dc:date>
<guid>https://wordaligned.org/articles/timtowtdi-vs-tsboapooowtdi</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/timtowtdi-vs-tsboapooowtdi</link>
<category>Python</category>
</item>

<item>
<title>Advent of Code 2017</title>
<description>&lt;p&gt;A big thanks to &lt;a href=&quot;http://was.tl&quot;&gt;Eric Wastl&lt;/a&gt; for another great &lt;a href=&quot;http://adventofcode.com/2017&quot;&gt;Advent of Code&lt;/a&gt;. Inspired by &lt;a href=&quot;https://github.com/norvig/pytudes/blob/master/ipynb/Advent%20of%20Code.ipynb&quot;&gt;Peter Norvig&lt;/a&gt;, I&amp;#8217;ve published &lt;a href=&quot;https://github.com/wordaligned/advent-of-code-2017/blob/master/advent-of-code-2017.ipynb&quot;&gt;my solutions&lt;/a&gt; as a Jupyter notebook.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/wordaligned/advent-of-code-2017/blob/master/advent-of-code-2017.ipynb&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/wordaligned/advent-of-code-2017/8e8344b4c5fc00827e42576059f36389dcfa453c/done.png&quot; alt=&quot;Done!&quot;/&gt;&lt;/a&gt;&lt;/p&gt;</description>
<dc:date>2018-01-12</dc:date>
<guid>https://wordaligned.org/articles/advent-of-code-2017</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/advent-of-code-2017</link>
<category>Python</category>
</item>

<item>
<title>Unleash the test army</title>
<description>&lt;h2 id=&quot;are-the-tests-adequate&quot;&gt;Are the tests adequate?&lt;/h2&gt;
&lt;p&gt;Recently I described a solution to &lt;a href=&quot;https://wordaligned.org/articles/slicing-a-list-evenly-with-python&quot;&gt;the problem of dividing a list into evenly sized chunks&lt;/a&gt;. It&amp;#8217;s a simple enough problem with just two inputs: the list (or other sliceable container) &lt;code&gt;xs&lt;/code&gt; and the number of chunks &lt;code&gt;n&lt;/code&gt;. Nonetheless, there are traps to avoid and special cases to consider &amp;#8212; what if &lt;code&gt;n&lt;/code&gt; is larger than the list, for example? Must the chunks comprise contiguous elements from the original list?&lt;/p&gt;
&lt;p&gt;The tests I came up with are straightforward and uninspiring. They were developed within the context of my own assumptions about the solution and the special cases I could imagine. They were written after the implementation &amp;#8212; which is to say, development wasn&amp;#8217;t driven by tests. They are whitebox tests, designed to cover the various paths through the code based on my insider knowledge.&lt;/p&gt;
&lt;p&gt;Are these tests adequate? Certainly they don&amp;#8217;t accurately represent the data which will hit the algorithm in practice. Can we be sure we haven&amp;#8217;t missed anything? Would the tests still cover all paths if the implementation changed?&lt;/p&gt;
&lt;h2 id=&quot;property-based-testing&quot;&gt;Property based testing&lt;/h2&gt;
&lt;p&gt;David R MacIver described another, complementary, approach at &lt;a href=&quot;https://accu.org/index.php/conferences/accu_conference_2016/accu2016_sessions#The_Plural_of_Anecdote_is_not_Test_Suite&quot;&gt;a talk I attended at ACCU 2016&lt;/a&gt;. In the talk abstract he characterises the (class of) tests I&amp;#8217;d written as &lt;strong&gt;anecdotal&lt;/strong&gt; &amp;#8212; &amp;#8220;let me tell you about this time I called a function &amp;#8230; and then it returned this .. and then it raised an exception &amp;#8230; etc. etc.&amp;#8221;&lt;/p&gt;
&lt;p&gt;How about if the test suite instead describes the &lt;strong&gt;properties&lt;/strong&gt; required of the system under test, and then conjures up inputs designed to see if these properties hold under stress? So, rather than our test suite being a limited set of input/output pairs, it becomes an executable specification validated by a robot army.&lt;/p&gt;
&lt;p&gt;&lt;a data-flickr-embed=&quot;true&quot;  href=&quot;https://www.flickr.com/photos/avatarr8/3819520612&quot; title=&quot;China&amp;#x27;s Robot Army&quot;&gt;&lt;img src=&quot;https://c1.staticflickr.com/3/2431/3819520612_4b32f2b423.jpg&quot; width=&quot;500&quot; height=&quot;334&quot; alt=&quot;China&amp;#x27;s Robot Army&quot;&gt;&lt;/a&gt;&lt;script async src=&quot;https://wordaligned.org//embedr.flickr.com/assets/client-code.js&quot; charset=&quot;utf-8&quot;&gt;&lt;/script&gt;&lt;/p&gt;
&lt;p&gt;&lt;span id=&quot;continue-reading&quot;/&gt;&lt;/p&gt;
&lt;h2 id=&quot;hypothesis&quot;&gt;Hypothesis&lt;/h2&gt;
&lt;p&gt;This approach sounds compelling but I had my doubts. I also had my doubts about the adequacy of both my code and tests. A perfect opportunity, then, to try out &lt;a href=&quot;https://hypothesis.readthedocs.io/en/latest/&quot;&gt;Hypothesis&lt;/a&gt;, an open source property-based testing library developed by David MacIver.&lt;/p&gt;
&lt;p&gt;I used the Python version of the library, which is the primary implementation. The rest of this article describes my experience of using &lt;a href=&quot;https://hypothesis.readthedocs.io/en/latest/&quot;&gt;hypothesis&lt;/a&gt; for the first time: I&amp;#8217;m not claiming expertise.&lt;/p&gt;
&lt;h2 id=&quot;first-impressions&quot;&gt;First impressions&lt;/h2&gt;
&lt;p&gt;Excellent!&lt;/p&gt;
&lt;p&gt;Installation was the usual &lt;code&gt;pip&lt;/code&gt; invocation. The &lt;a href=&quot;https://hypothesis.readthedocs.io/en/latest/&quot;&gt;documentation&lt;/a&gt; is written with care. It&amp;#8217;s evident the library is mature, supported and actively developed. It&amp;#8217;s licensed under the &lt;a href=&quot;https://www.mozilla.org/en-US/MPL/2.0/&quot;&gt;Mozilla Public License&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;my-first-test&quot;&gt;My first test&lt;/h2&gt;
&lt;p&gt;Recall that the code I wanted to test reads:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def chunk(xs, n):
    &amp;#x27;&amp;#x27;&amp;#x27;Split the list, xs, into n evenly sized chunks&amp;#x27;&amp;#x27;&amp;#x27;
    L = len(xs)
    assert 0 &amp;lt; n &amp;lt;= L
    s, r = divmod(L, n)
    t = s + 1
    return ([xs[p:p+t] for p in range(0, r*t, t)] +
            [xs[p:p+s] for p in range(r*t, L, s)])

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;I also proposed a second &lt;code&gt;itertools&lt;/code&gt; based version:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;from itertools import accumulate, chain, repeat, tee

def chunk(xs, n):
    &amp;#x27;&amp;#x27;&amp;#x27;Split the list, xs, into n evenly sized chunks&amp;#x27;&amp;#x27;&amp;#x27;
    assert n &amp;gt; 0
    L = len(xs)
    s, r = divmod(L, n)
    widths = chain(repeat(s+1, r), repeat(s, n-r))
    offsets = accumulate(chain((0,), widths))
    b, e = tee(offsets)
    next(e)
    return [xs[s] for s in map(slice, b, e)]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The first thing you notice when thinking about a property based test is that the specification &amp;#8212; the function&amp;#8217;s docstring &amp;#8212; doesn&amp;#8217;t describe the exact form of the output. In fact, as a comment on the article pointed out, my own interpretation of the specification is not the only one, and allowing the chunks to be formed from non-contiguous items permits a particularly elegant solution.&lt;/p&gt;
&lt;p&gt;Also, if the list doesn&amp;#8217;t divide exactly into &lt;code&gt;n&lt;/code&gt; chunks, what should the result be? Well, although I&amp;#8217;d have been happy with any evenly-chunked solution, my conventional unit tests &lt;strong&gt;assumed&lt;/strong&gt; an implementation which placed the larger chunks first.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def test_chunk():
    assert chunk(&amp;#x27;&amp;#x27;, 1) == [&amp;#x27;&amp;#x27;]
    assert chunk(&amp;#x27;ab&amp;#x27;, 2) == [&amp;#x27;a&amp;#x27;, &amp;#x27;b&amp;#x27;]
    assert chunk(&amp;#x27;abc&amp;#x27;, 2) == [&amp;#x27;ab&amp;#x27;, &amp;#x27;c&amp;#x27;]

    xs = list(range(8))
    assert chunk(xs, 2) == [[0, 1, 2, 3], [4, 5, 6, 7]]
    assert chunk(xs, 3) == [[0, 1, 2], [3, 4, 5], [6, 7]]
    assert chunk(xs, 5) == [[0, 1], [2, 3], [4, 5], [6], [7]]

    rs = range(1000000)
    assert chunk(rs, 2) == [range(500000), range(500000, 1000000)]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Notice, by the way, that although the docstring only mentions lists, I can&amp;#8217;t resist demonstrating the algorithm also behaves for strings and ranges &amp;#8212; for any sliceable sequence, in fact.&lt;/p&gt;
&lt;p&gt;Here&amp;#8217;s what I started with when I tried specifying the &amp;#8220;evenly sized&amp;#8221; property using hypothesis.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def test_evenly_chunked(xs, n):
    chunks = chunk(xs, n)
    assert len(chunks) == n
    chunk_lens = {len(c) for c in chunks}
    assert len(chunk_lens) in {1, 2}
    assert max(chunk_lens) - min(chunk_lens) in {0, 1}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This first test case defines &amp;#8220;evenly sized&amp;#8221;, stating that the result comprises &lt;code&gt;n&lt;/code&gt; chunks, that the set of the lengths of these chunks is either 1 (all chunks the same size) or 2, and the maximum chunk length is equal to or one more than the minumum chunk length.&lt;/p&gt;
&lt;p&gt;This doesn&amp;#8217;t fully specify the function. We also need assertions which confirm that recombining the chunks produces the original sequence.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def test_combining_chunks(xs_n):
    pass # We&amp;#x27;ll come back to this!

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We&amp;#8217;ll come back to this later!&lt;/p&gt;
&lt;p&gt;Now, &lt;code&gt;test_evenly_chunked()&lt;/code&gt; looks quite like a conventional test function. It just needs some input values. Rather than poke the function with some hand-chosen values, we can let hypothesis have a go.&lt;/p&gt;
&lt;p&gt;Based on a read of the &lt;a href=&quot;https://hypothesis.readthedocs.io/en/latest/quickstart.html&quot;&gt;Quick start guide&lt;/a&gt; I tried this:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;import hypothesis as ht
import hypothesis.strategies as st

@ht.given(xs=st.lists(), n=st.integers())
def test_evenly_chunked(xs, n):
    chunks = chunk(xs, n)
    assert len(chunks) == n
    chunk_lens = {len(c) for c in chunks}
    assert len(chunk_lens) in {1, 2}
    assert max(chunk_lens) - min(chunk_lens) in {0, 1}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;As you can see, the test function pre-conditions are encapsulated in a &lt;code&gt;hypothesis.given&lt;/code&gt; decorator, which specifies the use of &lt;code&gt;hypothesis.strategies.lists()&lt;/code&gt; and &lt;code&gt;hypothesis.strategies.integers()&lt;/code&gt; to generate test values for &lt;code&gt;xs&lt;/code&gt; and &lt;code&gt;n&lt;/code&gt; respectively.&lt;/p&gt;
&lt;p&gt;The result was a lengthy but helpful failure, which printed out the documentation of the &lt;code&gt;lists()&lt;/code&gt; strategy and the usage error:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;hypothesis.errors.InvalidArgument: Cannot create non-empty lists without an element type

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;OK then. The function doesn&amp;#8217;t really care about the element type. Integers will do.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;@ht.given(xs=st.lists(st.integers()), n=st.integers())
def test_evenly_chunked(xs, n):
    ....

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This gave me an error, along with a minimal test case which produces it.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;xs = [], n = 0

    def chunk(xs, n):
        &amp;#x27;&amp;#x27;&amp;#x27;Split the list, xs, into n evenly sized chunks&amp;#x27;&amp;#x27;&amp;#x27;
        L = len(xs)
&amp;gt;       assert 0 &amp;lt; n &amp;lt;= L
E       AssertionError

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Our function, &lt;code&gt;chunk()&lt;/code&gt; requires the value &lt;code&gt;n&lt;/code&gt; to be in the closed range &lt;code&gt;(0, len(xs)]&lt;/code&gt;. Looking more closely at the failure, we can see that the function under test, &lt;code&gt;chunk()&lt;/code&gt;, isn&amp;#8217;t great, since we won&amp;#8217;t be able to split an empty list into &lt;strong&gt;any&lt;/strong&gt; number of chunks since, in this case, &lt;code&gt;L&lt;/code&gt; is zero and no value of &lt;code&gt;n&lt;/code&gt; satisfies &lt;code&gt;0 &amp;lt; n &amp;lt;= L&lt;/code&gt;. &lt;/p&gt;
&lt;p&gt;At this point I had to makes some choices:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;should my tests confirm &lt;code&gt;chunk()&lt;/code&gt; was checking pre-conditions (by catching the &lt;code&gt;AssertionError&lt;/code&gt;)?&lt;/li&gt;
&lt;li&gt;should my function handle the case when &lt;code&gt;n &amp;gt; L&lt;/code&gt;? It&amp;#8217;s not the intended use of the function, but it can be handled.&lt;/li&gt;
&lt;li&gt;what about when &lt;code&gt;n == 0&lt;/code&gt;? Splitting a non-empty list into &lt;code&gt;0&lt;/code&gt; chunks is impossible, but I guess an empty list can be split into &lt;code&gt;0&lt;/code&gt; chunks.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&quot;my-second-test&quot;&gt;My second test&lt;/h2&gt;
&lt;p&gt;I made some decisions.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I decided not to test the pre-condition assertions. Instead, I&amp;#8217;d modify the test strategy to pass in valid inputs.&lt;/li&gt;
&lt;li&gt;I decided I&amp;#8217;d go with the &lt;code&gt;itertools&lt;/code&gt; chunk function which naturally handles &lt;code&gt;n &amp;gt; L&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;I decided my function needn&amp;#8217;t handle &lt;code&gt;n == 0&lt;/code&gt;, even when &lt;code&gt;xs == []&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here&amp;#8217;s the modified test code&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;@ht.given(xs=st.lists(st.integers()),
          n=st.integers(min_value=1))
def test_evenly_chunked(xs, n):
    chunks = chunk(xs, n)
    assert len(chunks) == n
    chunk_lens = {len(c) for c in chunks}
    assert len(chunk_lens) in {1, 2}
    assert max(chunk_lens) - min(chunk_lens) in {0, 1}
    ....

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;When I tried running the tests again, they appeared to hang until I interrupted them.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt; py.test
============================= test session starts =============================
platform win32 -- Python 3.5.3, pytest-3.0.7, py-1.4.33, pluggy-0.4.0
....
plugins: hypothesis-3.8.3
collected 1 items

test_chunk.py   C-c C-c

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! KeyboardInterrupt !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
to show a full traceback on KeyboardInterrupt use --fulltrace

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Now, I had a suspicion that &lt;code&gt;chunk()&lt;/code&gt; couldn&amp;#8217;t really handle &lt;strong&gt;any&lt;/strong&gt; input integers &amp;#8212; it was designed for a value of &lt;code&gt;n&lt;/code&gt; equal to &lt;code&gt;multiprocessing.cpu_count()&lt;/code&gt; &amp;#8212; but I wanted to see what would happen with no upper limits. Here was my answer. Running and interrupting again with &lt;code&gt;--fulltrace&lt;/code&gt; set, I got several pages of output ending with the test inputs:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;xs = [50], n = 67108865

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Evidently my code was taking a while to create a list comprising a single list &lt;code&gt;[50]&lt;/code&gt; and over 67 million empty lists &lt;code&gt;[], [], [], ...&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Once again, I had a decision to make. Perhaps unsurprisingly, it&amp;#8217;s a decision I&amp;#8217;d already faced. I could make &lt;code&gt;chunk()&lt;/code&gt; a generator function, yielding the chunks one at a time &amp;#8212; a trivial and natural change to the &lt;code&gt;itertools&lt;/code&gt; based implementation &amp;#8212; or I could constrain the tests to more suitable values of &lt;code&gt;n&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In this case I decided to stick with what I had: my function would accept a list and return a list (of lists). In an attempt to get some passing tests, I set a maximum value on &lt;code&gt;n&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;@ht.given(xs=st.lists(st.integers()),
          n=st.integers(min_value=1, max_value=100))
def test_evenly_chunked(xs, n):
    ....

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;At last, I had a passing test.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;py.test 
============================= test session starts =============================
platform win32 -- Python 3.5.3, pytest-3.0.7, py-1.4.33, pluggy-0.4.0
rootdir: /work/sliced-python, inifile:
plugins: hypothesis-3.8.3
collected 1 items

test_chunk.py .

========================== 1 passed in 0.23 seconds ===========================

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Building on this success, I wanted to confirm the function also handled other sliceable types &amp;#8212; strings and bytes specifically. Hypothesis provides a &lt;code&gt;one_of&lt;/code&gt; strategy for combining other strategies.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;@ht.given(xs=st.one_of(st.text(),
                       st.binary(),
                       st.lists(st.integers())),
          n=st.integers(min_value=1, max_value=100))
def test_evenly_chunked(xs, n):
    chunks = chunk(xs, n)
    assert len(chunks) == n
    chunk_lens = {len(c) for c in chunks}
    assert len(chunk_lens) in {1, 2}
    assert max(chunk_lens) - min(chunk_lens) in {0, 1}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Again, the test passes.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;py.test
============================= test session starts =============================
platform win32 -- Python 3.5.3, pytest-3.0.7, py-1.4.33, pluggy-0.4.0
rootdir: /work/sliced-python, inifile:
plugins: hypothesis-3.8.3
collected 1 items

test_chunk.py .

========================== 1 passed in 0.30 seconds ===========================

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This output is rather inscrutable. Generally, passing tests shouldn&amp;#8217;t draw attention to themselves, but what inputs had my test strategies generated? Were they sufficient?&lt;/p&gt;
&lt;p&gt;A commandline switch provides a little more detail.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;py.test --hypothesis-show-statistics

....

test_chunk.py::test_evenly_chunked:

  - 200 passing examples, 0 failing examples, 0 invalid examples
  - Typical runtimes: &amp;lt; 1ms
  - Stopped because settings.max_examples=200

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It&amp;#8217;s also possible to peek at examples produced by the test strategy.&lt;/p&gt;
&lt;pre&gt;
&amp;gt;&amp;gt;&amp;gt; s=st.one_of(st.text(), st.binary(), st.lists(st.integers()))
&amp;gt;&amp;gt;&amp;gt; s.example()
b&#x27;&#x27;
&amp;gt;&amp;gt;&amp;gt; s.example()
b&#x27;\xc2\xfd6[&#x27;
&amp;gt;&amp;gt;&amp;gt; s.example()
&#x27;:\n&amp;uacute;&amp;amp;\U000ea7e8&#x27;
&amp;gt;&amp;gt;&amp;gt; s.example()
b&#x27;\xe7z&#x27;
&amp;gt;&amp;gt;&amp;gt; s.example()
&#x27;&#x27;
&amp;gt;&amp;gt;&amp;gt; s.example()
[184, 36, -205, 1486638017]
&lt;/pre&gt;

&lt;h2 id=&quot;complete-test-suite&quot;&gt;Complete test suite&lt;/h2&gt;
&lt;p&gt;Here&amp;#8217;s my final test suite. Rather than hard code a maximum value for &lt;code&gt;n&lt;/code&gt;, I used a &lt;a href=&quot;https://hypothesis.readthedocs.io/en/latest/data.html#composite-strategies&quot;&gt;composite strategy&lt;/a&gt; which adapts &lt;code&gt;n&lt;/code&gt; to the size of &lt;code&gt;xs&lt;/code&gt;. I&amp;#8217;ve also added a test which confirms the result does comprise chunks of the input sequence.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;import functools

import hypothesis as ht
import hypothesis.strategies as st

@st.composite
def items_and_chunk_count(draw):
    xs = draw(st.one_of(st.text(),
                        st.binary(),
                        st.lists(st.integers())))
    n = draw(st.integers(min_value=1,
                         max_value=max(1, len(xs))))
    return xs, n


@ht.given(xs_n=items_and_chunk_count())
def test_evenly_chunked(xs_n):
    &amp;#x27;&amp;#x27;&amp;#x27;Verify there are n evenly sized chunks&amp;#x27;&amp;#x27;&amp;#x27;
    xs, n = xs_n
    chunks = chunk(xs, n)
    assert len(chunks) == n
    chunk_lens = {len(c) for c in chunks}
    assert len(chunk_lens) in {1, 2}
    assert max(chunk_lens) - min(chunk_lens) in {0, 1}


@ht.given(xs_n=items_and_chunk_count())
def test_combining_chunks(xs_n):
    &amp;#x27;&amp;#x27;&amp;#x27;Verify recombining the chunks reproduces the original sequence.&amp;#x27;&amp;#x27;&amp;#x27;
    xs, n = xs_n
    chunks = chunk(xs, n)
    assert functools.reduce(lambda x, y: x+y, chunks) == xs

&lt;/pre&gt;

&lt;/div&gt;

&lt;h2 id=&quot;quality-of-failure&quot;&gt;Quality of failure&lt;/h2&gt;
&lt;p&gt;In the comments to my original article Mike Edey put forward an elegant solution to the original problem of evenly subdividing a sequence into an exact number of chunks:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def chunk(xs, n):
    return [xs[index::n] for index in range(n)]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This is a delightful piece piece of code, and an approach I simply hadn&amp;#8217;t considered. If the input list &lt;code&gt;xs&lt;/code&gt; represents a number of tasks to be distributed amongst &lt;code&gt;n&lt;/code&gt; workers, this does the job evenly. In my actual motivating example, though, however, the input sequence was a document which caused a problem, and what I wanted to do was split that document up into a number of sections and see which of these exhibited the same problem: that is, I needed the chunks to be contiguous blocks of text from the original document. This is the property which &lt;code&gt;test_combining_chunks()&lt;/code&gt; checks.&lt;/p&gt;
&lt;p&gt;Running Mike Edey&amp;#8217;s implementation through the test suite, we get:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;py.test
============================= test session starts =============================
platform win32 -- Python 3.5.3, pytest-3.0.7, py-1.4.33, pluggy-0.4.0
rootdir: /work/sliced-python, inifile:
plugins: hypothesis-3.8.3
collected 2 items

test_chunk.py .F

================================== FAILURES ===================================
____________________________ test_combining_chunks ____________________________

    @ht.given(xs_n=items_and_chunk_count())
&amp;gt;   def test_combining_chunks(xs_n):

test_chunk.py:29: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
d:\venvs\slackbot\lib\site-packages\hypothesis\core.py:634: in wrapped_test
    state.run()
d:\venvs\slackbot\lib\site-packages\hypothesis\core.py:531: in run
    print_example=True, is_final=True
d:\venvs\slackbot\lib\site-packages\hypothesis\executors.py:58: in default_new_style_executor
    return function(data)
d:\venvs\slackbot\lib\site-packages\hypothesis\core.py:113: in run
    return test(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

xs_n = (&amp;#x27;001&amp;#x27;, 2)

    @ht.given(xs_n=items_and_chunk_count())
    def test_combining_chunks(xs_n):
        &amp;#x27;&amp;#x27;&amp;#x27;Verify recombining the chunks reproduces the original sequence.&amp;#x27;&amp;#x27;&amp;#x27;
        xs, n = xs_n
        chunks = chunk(xs, n)
&amp;gt;       assert functools.reduce(lambda x, y: x+y, chunks) == xs
E       AssertionError: assert &amp;#x27;010&amp;#x27; == &amp;#x27;001&amp;#x27;
E         - 010
E         + 001

test_chunk.py:33: AssertionError
--------------------------------- Hypothesis ----------------------------------
Falsifying example: test_combining_chunks(xs_n=(&amp;#x27;001&amp;#x27;, 2))
===================== 1 failed, 1 passed in 0.52 seconds ======================

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Hypothesis has discovered a minimal failing example: the string &lt;code&gt;001&lt;/code&gt; splits into &lt;code&gt;2&lt;/code&gt; chunks as &lt;code&gt;01&lt;/code&gt;, &lt;code&gt;0&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&quot;conclusions&quot;&gt;Conclusions&lt;/h2&gt;
&lt;p&gt;Hypothesis worked well for this particular example.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;it forced me to pin down the function specification&lt;/li&gt;
&lt;li&gt;I had to consider the special cases: would the function behave in the face of logically permissable inputs, and not just the ones I had in mind when I wrote it&lt;/li&gt;
&lt;li&gt;it increased my confidence the function was correct&lt;/li&gt;
&lt;li&gt;and particularly appealing, in this case &amp;#8212; the tests were not tied to a detail of the implementation, and would continue to work if, for example, the larger chunks were to appear at the end of the results.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;More generally, I found the hypothesis library solid. It&amp;#8217;s well designed and documented, as well as being a fine example of how to use Python decorators.&lt;/p&gt;
&lt;p&gt;I&amp;#8217;d say property based testing complements example based testing. Example based unit tests show you how a function is used, for instance; with hypothesis, this useful demonstration happens behind the scenes (though note that your hypothesis tests &lt;strong&gt;can&lt;/strong&gt; include explicit &lt;a href=&quot;https://hypothesis.readthedocs.io/en/latest/details.html#providing-explicit-examples&quot;&gt;@examples&lt;/a&gt;). Example based unit tests are typically one or two orders of magnitude quicker to execute. It&amp;#8217;s not a problem if a couple of tests take half a second to run, but what if you have a couple of thousand tests?&lt;/p&gt;
&lt;p&gt;In my case the built-in strategies were good enough to generate inputs to my function. I can imagine that&amp;#8217;s not the case for functions higher up a software stack. Test setup functions can be hard work and I suspect test setup strategies would be harder.&lt;/p&gt;
&lt;p&gt;In closing, I&amp;#8217;d like to quote from the section of the &lt;a href=&quot;https://hypothesis.readthedocs.io/en/latest/manifesto.html&quot;&gt;Hypothesis documentation&lt;/a&gt; which describes its purpose.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Software is, as they say, eating the world. Software is also &lt;a href=&quot;https://www.youtube.com/watch?v=csyL9EC0S0c&quot;&gt;terrible&lt;/a&gt;. It&amp;#8217;s buggy, insecure and generally poorly thought out. This combination is clearly a recipe for disaster.&lt;/p&gt;
&lt;p&gt;And the state of software testing is even worse. Although it&amp;#8217;s fairly uncontroversial at this point that you should be testing your code, can you really say with a straight face that most projects you&amp;#8217;ve worked on are adequately tested?&lt;/p&gt;
&lt;p&gt;A lot of the problem here is that it&amp;#8217;s too hard to write good tests. Your tests encode exactly the same assumptions and fallacies that you had when you wrote the code, so they miss exactly the same bugs that you missed when you wrote the code.&lt;/p&gt;
&lt;p&gt;&amp;#8212; &lt;a href=&quot;https://hypothesis.readthedocs.io/en/latest/manifesto.html&quot;&gt;The Purpose of Hypothesis&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;</description>
<dc:date>2017-05-29</dc:date>
<guid>https://wordaligned.org/articles/unleash-the-test-army</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/unleash-the-test-army</link>
<category>Python</category>
</item>

<item>
<title>Lazy sequences working hard</title>
<description>&lt;p&gt;I gave a talk &lt;a href=&quot;http://www.pydiff.wales/events/2017-05-16.html&quot;&gt;@PyDiff&lt;/a&gt; this evening in the computer science department at Cardiff University.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;h2&gt;Lazy Sequences working hard&lt;/h2&gt;&lt;/p&gt;
&lt;p&gt;Python has no problem handling large and even infinite streams of data. Just write lazy programs &amp;#8212; code which defers data access until the last minute. This talk examines Python&amp;#8217;s language and library support for such delaying tactics. There will be live coding, and we&amp;#8217;ll draw parallels with similar features in other languages, in particular the Unix shell.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Being unsure where to pitch it, I started off easy and kept going until I&amp;#8217;d lost everybody &amp;#8212; including myself.&lt;/p&gt;
&lt;p&gt;The room was well set up with a good quality projector and whiteboard, along with a desk to sit down when I wanted to write and run code and plenty of space to move around in otherwise. I did feel a bit like a jack-in-the-box by the end.&lt;/p&gt;
&lt;p&gt;&lt;a data-flickr-embed=&quot;true&quot;  href=&quot;https://www.flickr.com/photos/thomasguest/34663534326/in/dateposted-friend/&quot; title=&quot;Me @PyDiff&quot;&gt;&lt;img src=&quot;https://c1.staticflickr.com/5/4164/34663534326_4bf8851eeb.jpg&quot; width=&quot;500&quot; height=&quot;375&quot; alt=&quot;Me @PyDiff&quot;&gt;&lt;/a&gt;&lt;script async src=&quot;https://wordaligned.org//embedr.flickr.com/assets/client-code.js&quot; charset=&quot;utf-8&quot;&gt;&lt;/script&gt;&lt;/p&gt;
&lt;p&gt;I&amp;#8217;d based the talk on a Jupyter notebook which I replayed with the ingenious &lt;a href=&quot;https://github.com/damianavila/RISE&quot;&gt;RISE reveal.js extension&lt;/a&gt; written by Damian Avila. This worked well, since I got the pretty graphics along with the interactive coding. A static version of the slides is &lt;a href=&quot;http://nbviewer.jupyter.org/url/wordaligned.org/docs/lazy-sequences.ipynb&quot;&gt;available here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Thanks to everyone who came. Sorry I had to rush off after. If anyone would &lt;a href=&quot;http://swanseasdc.co.uk/&quot;&gt;like to talk at Swansea&lt;/a&gt;, please let me know: you&amp;#8217;d be most welcome.&lt;/p&gt;</description>
<dc:date>2017-05-16</dc:date>
<guid>https://wordaligned.org/articles/lazy-sequences-working-hard</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/lazy-sequences-working-hard</link>
<category>Python</category>
</item>

<item>
<title>Slicing a list evenly with Python</title>
<description>&lt;p&gt;&lt;img src=&quot;https://wordaligned.org/images/sliced-python.jpg&quot; alt=&quot;Sliced Python&quot;/&gt;&lt;/p&gt;
&lt;p&gt;Here&amp;#8217;s a problem I came up against recently.&lt;/p&gt;
&lt;p&gt;The task was to chop a list into exactly &lt;code&gt;n&lt;/code&gt; evenly slized chunks. To give a little more context, let&amp;#8217;s suppose we want to divide a list of jobs equally between &lt;code&gt;n&lt;/code&gt; workers, where &lt;code&gt;n&lt;/code&gt; might be the number of CPU cores available.&lt;/p&gt;
&lt;p&gt;We can build the result by repeatedly slicing the input:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def chunk(xs, n):
    &amp;#x27;&amp;#x27;&amp;#x27;Split the list, xs, into n chunks&amp;#x27;&amp;#x27;&amp;#x27;
    L = len(xs)
    assert 0 &amp;lt; n &amp;lt;= L
    s = L//n
    return [xs[p:p+s] for p in range(0, L, s)]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This looks promising&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; chunk(&amp;#x27;abcdefghi&amp;#x27;, 3)
[&amp;#x27;abc&amp;#x27;, &amp;#x27;def&amp;#x27;, &amp;#x27;ghi&amp;#x27;]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;but if the size of the list is not an exact multiple of &lt;code&gt;n&lt;/code&gt;, the result won&amp;#8217;t contain &lt;strong&gt;exactly&lt;/strong&gt; &lt;code&gt;n&lt;/code&gt; chunks.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; chunk(&amp;#x27;abcde&amp;#x27;, 3)
[&amp;#x27;a&amp;#x27;, &amp;#x27;b&amp;#x27;, &amp;#x27;c&amp;#x27;, &amp;#x27;d&amp;#x27;, &amp;#x27;e&amp;#x27;]
&amp;gt;&amp;gt;&amp;gt; chunk(&amp;#x27;abcdefgh&amp;#x27;, 3)
[&amp;#x27;ab&amp;#x27;, &amp;#x27;cd&amp;#x27;, &amp;#x27;ef&amp;#x27;, &amp;#x27;gh&amp;#x27;]
&amp;gt;&amp;gt;&amp;gt; chunk(&amp;#x27;abcdefghij&amp;#x27;, 3)
[&amp;#x27;abc&amp;#x27;, &amp;#x27;def&amp;#x27;, &amp;#x27;ghi&amp;#x27;, &amp;#x27;j&amp;#x27;]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;(By the way, I&amp;#8217;m using strings rather than lists in the examples. The code works equally well for both types, and strings make it slightly easier to see what&amp;#8217;s going on.)&lt;/p&gt;
&lt;p&gt;One way to fix the problem is to group the final chunks together.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def chunk(xs, n):
    &amp;#x27;&amp;#x27;&amp;#x27;Split the list, xs, into n chunks&amp;#x27;&amp;#x27;&amp;#x27;
    L = len(xs)
    assert 0 &amp;lt; n &amp;lt;= L
    s, r = divmod(L, n)
    chunks = [xs[p:p+s] for p in range(0, L, s)]
    chunks[n-1:] = [xs[-r-s:]]
    return chunks

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Now we have exactly &lt;code&gt;n&lt;/code&gt; chunks, but they may not be evenly sized, since the last chunk gets padded with any surplus.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; chunk(&amp;#x27;abcde&amp;#x27;, 3)
[&amp;#x27;a&amp;#x27;, &amp;#x27;b&amp;#x27;, &amp;#x27;cde&amp;#x27;]
&amp;gt;&amp;gt;&amp;gt; chunk(&amp;#x27;abcdefgh&amp;#x27;, 3)
[&amp;#x27;ab&amp;#x27;, &amp;#x27;cd&amp;#x27;, &amp;#x27;efgh&amp;#x27;]
&amp;gt;&amp;gt;&amp;gt; chunk(&amp;#x27;abcdefghij&amp;#x27;, 3)
[&amp;#x27;abc&amp;#x27;, &amp;#x27;def&amp;#x27;, &amp;#x27;ghij&amp;#x27;]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;What does &amp;#8220;evenly sized&amp;#8221; actually mean? Loosely speaking, we want the resulting chunks as closely sized as possible.&lt;/p&gt;
&lt;p&gt;More precisely, if the result of dividing the length of the list &lt;code&gt;L&lt;/code&gt; by the number of chunks &lt;code&gt;n&lt;/code&gt; gives a size &lt;code&gt;s&lt;/code&gt; with remainder &lt;code&gt;r&lt;/code&gt;, then the function should return &lt;code&gt;r&lt;/code&gt; chunks of size &lt;code&gt;s+1&lt;/code&gt; and &lt;code&gt;n-r&lt;/code&gt; chunks of size &lt;code&gt;s&lt;/code&gt;. There are &lt;a href=&quot;https://en.wikipedia.org/wiki/Combination&quot;&gt;choose(n, r)&lt;/a&gt; ways of doing this. Here&amp;#8217;s a solution which puts the longer chunks to the front of the results.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def chunk(xs, n):
    &amp;#x27;&amp;#x27;&amp;#x27;Split the list, xs, into n evenly sized chunks&amp;#x27;&amp;#x27;&amp;#x27;
    L = len(xs)
    assert 0 &amp;lt; n &amp;lt;= L
    s, r = divmod(L, n)
    t = s + 1
    return ([xs[p:p+t] for p in range(0, r*t, t)] +
            [xs[p:p+s] for p in range(r*t, L, s)])

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here&amp;#8217;s a second implementation, this time using &lt;code&gt;itertools&lt;/code&gt;. Chaining &lt;code&gt;r&lt;/code&gt; copies of &lt;code&gt;s+1&lt;/code&gt; and &lt;code&gt;n-r&lt;/code&gt; copies of &lt;code&gt;s&lt;/code&gt; gives us the &lt;code&gt;n&lt;/code&gt; chunk widths. Accumulating the widths gives us the list offsets for slicing &amp;#8212; though note we need to prepend an initial &lt;code&gt;0&lt;/code&gt;. Now we can form a &lt;a href=&quot;https://wordaligned.org/articles/zippy-triples-served-with-python&quot;&gt;(this, next) pair of iterators&lt;/a&gt; over the offsets, and the result is the accumulation of repeated &lt;code&gt;(begin, end)&lt;/code&gt; slices taken from the original list.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;from itertools import accumulate, chain, repeat, tee

def chunk(xs, n):
    assert n &amp;gt; 0
    L = len(xs)
    s, r = divmod(L, n)
    widths = chain(repeat(s+1, r), repeat(s, n-r))
    offsets = accumulate(chain((0,), widths))
    b, e = tee(offsets)
    next(e)
    return [xs[s] for s in map(slice, b, e)]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This version does something sensible in the case when the number of slices, &lt;code&gt;n&lt;/code&gt;, exceeds the length of the list.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; chunk(&amp;#x27;ab&amp;#x27;, 5)
[&amp;#x27;a&amp;#x27;, &amp;#x27;b&amp;#x27;, &amp;#x27;&amp;#x27;, &amp;#x27;&amp;#x27;, &amp;#x27;&amp;#x27;]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Finally, some tests.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def test_chunk():
    assert chunk(&amp;#x27;&amp;#x27;, 1) == [&amp;#x27;&amp;#x27;]
    assert chunk(&amp;#x27;ab&amp;#x27;, 2) == [&amp;#x27;a&amp;#x27;, &amp;#x27;b&amp;#x27;]
    assert chunk(&amp;#x27;abc&amp;#x27;, 2) == [&amp;#x27;ab&amp;#x27;, &amp;#x27;c&amp;#x27;]

    xs = list(range(8))
    assert chunk(xs, 2) == [[0, 1, 2, 3], [4, 5, 6, 7]]
    assert chunk(xs, 3) == [[0, 1, 2], [3, 4, 5], [6, 7]]
    assert chunk(xs, 5) == [[0, 1], [2, 3], [4, 5], [6], [7]]

    rs = range(1000000)
    assert chunk(rs, 2) == [range(500000), range(500000, 1000000)]

&lt;/pre&gt;

&lt;/div&gt;</description>
<dc:date>2017-05-14</dc:date>
<guid>https://wordaligned.org/articles/slicing-a-list-evenly-with-python</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/slicing-a-list-evenly-with-python</link>
<category>Python</category>
</item>

<item>
<title>From bytes to strings in Python and back again</title>
<description>&lt;p&gt;Low level languages like C have little opinion about what goes in a string, which is simply a null-terminated sequence of bytes. Those bytes could be ASCII or UTF-8 encoded text, or they could be raw data &amp;#8212; object code, for example. It&amp;#8217;s quite possible and legal to have a C string with mixed content.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;char const * mixed =
    &quot;EURO SIGN &quot;          // ASCII
    &quot;UTF-8 \xE2\x82\xAC &quot; // UTF-8 encoded EURO SIGN
    &quot;Latin-9 \xA4&quot;;       // Latin-9 encoded EURO SIGN
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This might seem indisciplined and risky but it can be useful. Environment variables are notionally text but actually C strings, for example, meaning they can hold whatever data you want. Similarly filenames and command line parameters are only loosely text.&lt;/p&gt;
&lt;p&gt;A higher level language like Python makes a strict distinction between bytes and strings. Bytes objects contain raw data &amp;#8212; a sequence of octets &amp;#8212; whereas strings are Unicode sequences. Conversion between the two types is explicit: you encode a string to get bytes, specifying an encoding (which defaults to UTF-8); and you decode bytes to get a string. Clients of these functions should be aware that such conversions may fail, and should consider how failures are handled.&lt;/p&gt;
&lt;p&gt;Simply put, a string in Python is a valid Unicode sequence. Real world text data may not be. Programmers need to take charge of reconciling any discrepancies.&lt;/p&gt;
&lt;p&gt;We faced such problems recently &lt;a href=&quot;https://clinithink.com&quot;&gt;at work&lt;/a&gt;. We&amp;#8217;re in the business of extracting meaning from clinical narratives &amp;#8212; text data stored on medical records systems in hospitals, for example. These documents may well have passed through a variety of systems. They may be unclear about their text encoding. They may not be encoded as they claim. So what? They can and do contain abbreviations, mispellings, jargon and colloquialisms. Refining the signal from such noise is our core business: if we can correctly interpret positional and temporal aspects of a sentence such as:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Previous fracture of left neck of femur&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;then we can surely deal with text which claims to be UTF-8 encoded but isn&amp;#8217;t really.&lt;/p&gt;
&lt;p&gt;Our application stack is server-based: a REST API to a Python application handles document ingest; lower down, a C++ engine does the actual document processing. The problem we faced was supporting a modern API capable of handling real world data.&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s both undesirable and unnecessary to require clients to clean their text before submitting it. We want to make the ingest direct and idiomatic. Also, we shouldn&amp;#8217;t penalise clients whose data &lt;strong&gt;is&lt;/strong&gt; clean. Thus document upload is an HTTP POST request, and the document content is a JSON string &amp;#8212; rather than, say, base64 encoded binary data. Our server, however, will be permissive about the contents of this string.&lt;/p&gt;
&lt;p&gt;So far so good. &lt;a href=&quot;http://www.catb.org/jargon/html/P/Postels-Prescription.html&quot; title=&quot;Postel&#x27;s prescription&quot;&gt;Postel&amp;#8217;s prescription&lt;/a&gt; advises:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Be liberal in what you accept, and conservative in what you send.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This would suggest accepting messy text data but presenting it in a cleaned up form. In our case, we &lt;strong&gt;do&lt;/strong&gt; normalise the input data &amp;#8212; a process which includes detecting and standardising date/time information, expanding abbreviations, fixing typos and so on &amp;#8212; but this normalised form links back to a faithful copy of the original data. What gets presented to the user is their own text annotated with our findings. That is, we subscribe to a &lt;a href=&quot;https://en.wikipedia.org/wiki/Garbage_in,_garbage_out&quot;&gt;more primitive prescription&lt;/a&gt; than Postel&amp;#8217;s:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Garbage in, garbage out&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;with the caveat that the garbage shouldn&amp;#8217;t be damaged in transit.&lt;/p&gt;
&lt;p&gt;Happily, there is a simple way to pass dodgy strings through Python. It&amp;#8217;s used in the standard library to handle text data which isn&amp;#8217;t guaranteed to be clean &amp;#8212; those environment variables, command line parameters, and filenames for example.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;surrogateescape&lt;/code&gt; error handler smuggles non-decodable bytes into the (Unicode) Python string in such a way that the original bytes can be recovered on encode, as described in &lt;a href=&quot;https://www.python.org/dev/peps/pep-0383/&quot; title=&quot;PEP 383 -- Non-decodable Bytes in System Character Interfaces&quot;&gt;PEP 383&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;On POSIX systems, Python currently applies the locale&amp;#8217;s encoding to convert the byte data to Unicode, failing for characters that cannot be decoded. With &lt;a href=&quot;https://www.python.org/dev/peps/pep-0383/&quot; title=&quot;PEP 383 -- Non-decodable Bytes in System Character Interfaces&quot;&gt;this PEP&lt;/a&gt;, non-decodable bytes &amp;gt;= 128 will be represented as lone surrogate codes U+DC80..U+DCFF.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This workaround is possible because Unicode surrogates are intended for use in pairs. Quoting the Unicode specification, they &lt;a href=&quot;http://www.unicode.org/versions/Unicode9.0.0/ch03.pdf#G2630&quot;&gt;&amp;#8220;have no interpretation on their own&amp;#8221;&lt;/a&gt;. The lone trailing surrogate code &amp;#8212; the half-a-pair &amp;#8212; can only be the result of a &lt;code&gt;surrogateescape&lt;/code&gt; error handler being invoked, and the original bytes can be recovered by using the same error handler on encode.&lt;/p&gt;
&lt;p&gt;In conclusion, text data is handled differently in C++ and Python, posing a problem for layered applications. The &lt;code&gt;surrogateescape&lt;/code&gt; error handler provides a standard and robust way of closing the gap.&lt;/p&gt;
&lt;h2 id=&quot;unicode-surrogate-pairs&quot;&gt;Unicode Surrogate Pairs&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;http://www.unicode.org/versions/Unicode9.0.0/ch03.pdf#G2630&quot;&gt;&lt;img src=&quot;https://wordaligned.org/images/unicode-surrogates.png&quot; alt=&quot;Surrogates&quot;/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&quot;code-listing&quot;&gt;Code Listing&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; mixed = b&quot;EURO SIGN \xE2\x82\xAC \xA4&quot;
&amp;gt;&amp;gt;&amp;gt; mixed
b&#x27;EURO SIGN \xe2\x82\xac \xa4&#x27;
&amp;gt;&amp;gt;&amp;gt; mixed.decode()
Traceback (most recent call last):
  File &quot;&amp;lt;stdin&amp;gt;&quot;, line 1, in &amp;lt;module&amp;gt;
UnicodeDecodeError: &#x27;utf-8&#x27; codec can&#x27;t decode byte 0xa4 in position 14:
  invalid start byte
&amp;gt;&amp;gt;&amp;gt; help(mixed.decode)
Help on built-in function decode:

decode(encoding=&#x27;utf-8&#x27;, errors=&#x27;strict&#x27;) method of builtins.bytes instance
    Decode the bytes using the codec registered for encoding.

    encoding
      The encoding with which to decode the bytes.
    errors
      The error handling scheme to use for the handling of decoding errors.
      The default is &#x27;strict&#x27; meaning that decoding errors raise a
      UnicodeDecodeError. Other possible values are &#x27;ignore&#x27; and &#x27;replace&#x27;
      as well as any other name registered with codecs.register_error that
      can handle UnicodeDecodeErrors.

&amp;gt;&amp;gt;&amp;gt; mixed.decode(errors=&#x27;surrogateescape&#x27;)
&#x27;EURO SIGN &amp;euro; \udca4&#x27;
&amp;gt;&amp;gt;&amp;gt; s = mixed.decode(errors=&#x27;surrogateescape&#x27;)
&amp;gt;&amp;gt;&amp;gt; s.encode()
Traceback (most recent call last):
  File &quot;&amp;lt;stdin&amp;gt;&quot;, line 1, in &amp;lt;module&amp;gt;
UnicodeEncodeError: &#x27;utf-8&#x27; codec can&#x27;t encode character &#x27;\udca4&#x27; in position 12:
  surrogates not allowed
&amp;gt;&amp;gt;&amp;gt; s.encode(errors=&#x27;surrogateescape&#x27;)
b&#x27;EURO SIGN \xe2\x82\xac \xa4&#x27;
&lt;/code&gt;&lt;/pre&gt;</description>
<dc:date>2017-03-24</dc:date>
<guid>https://wordaligned.org/articles/from-bytes-to-strings-in-python-and-back-again</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/from-bytes-to-strings-in-python-and-back-again</link>
<category>Python</category>
</item>

<item>
<title>24 Puzzles</title>
<description>&lt;p&gt;On &lt;a href=&quot;http://blog.plover.com&quot;&gt;The Universe of Discourse&lt;/a&gt; Mark Dominus discusses the classic &lt;a href=&quot;http://blog.plover.com/math/24-puzzle.html&quot;&gt;24 Puzzle&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You are given a sequence of four digits, say 1, 2, 3, 4, and your job is to combine them with ordinary arithmetic operations (+, -, &amp;times;, and &amp;divide;) in any order to make a target number, typically 24. For example, with 1, 2, 3, 4, you can go with &lt;code&gt;((1+2)+3)&amp;times;4=24&lt;/code&gt; or with &lt;code&gt;4&amp;times;((2&amp;times;3)&amp;times;1)=24&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here&amp;#8217;s a solver for such puzzles. It uses &lt;code&gt;itertools&lt;/code&gt; to generate possible expressions, &lt;code&gt;fractions&lt;/code&gt; to get the arithmetic right, and &lt;code&gt;eval&lt;/code&gt; to evaluate expressions. It&amp;#8217;s limited to expressions formed from 4 numbers which means I don&amp;#8217;t have to programmatically calculate different ways of parenthesising: there are only 5.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;# http://blog.plover.com/math/24-puzzle.html
import re
import math

# Use fractions for exact calculations
from fractions import Fraction

# Solve for 4 numbers only!
N = 4 

# So these are the only expression templates
# where X is a number and @ is an operation
templates = &amp;#x27;&amp;#x27;&amp;#x27;\
((X @ X) @ X) @ X
(X @ (X @ X)) @ X
X @ ((X @ X) @ X)
X @ (X @ (X @ X))
(X @ X) @ (X @ X)&amp;#x27;&amp;#x27;&amp;#x27;.splitlines()

import itertools as its

def defrac(s):
    return re.compile(r&amp;#x27;Fraction\((\d+)\)&amp;#x27;).sub(r&amp;#x27;\1&amp;#x27;, s)

def evaluate(nums, ops, template):
    fracs = (&amp;#x27;Fraction(%s)&amp;#x27; % n for n in nums)
    ops = iter(ops)
    expr = &amp;#x27;&amp;#x27;.join(next(fracs) if c == &amp;#x27;X&amp;#x27; else
                   next(ops) if c == &amp;#x27;@&amp;#x27; else c
                   for c in template)
    try:
        return expr, eval(expr)
    except ZeroDivisionError:
        return expr, None        

def solve(spec, ops):
    numbers = re.compile(r&amp;#x27;\d+&amp;#x27;).findall(spec)
    assert len(numbers) == N + 1
    result = Fraction(numbers.pop())
    seqs = its.product(its.permutations(numbers),
                       its.product(ops, repeat=N-1),
                       templates)
    print(defrac(next((e for e, v in its.starmap(evaluate, seqs)
                       if v == result),
                      &amp;#x27;Impossible&amp;#x27;)))

def main():
    solve(&amp;#x27;2,5,6,6 =&amp;gt; 17&amp;#x27;, &amp;#x27;+-/*&amp;#x27;)
    solve(&amp;#x27;3,3,8,8 =&amp;gt; 24&amp;#x27;, &amp;#x27;+-/*&amp;#x27;)

main()

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here&amp;#8217;s a second attempt, which doesn&amp;#8217;t assume there will just be 4 numbers on the left hand side of the equation. Given a sequence of numbers and a set of operators, it repeatedly reduces the sequence length by picking pair of numbers and combining them using one of the operators, iterating over all possible ways of doing this. The first sequence of length 1 which equals the target value gives a solution and terminates the search.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;# http://blog.plover.com/math/24-puzzle.html
from fractions import Fraction
import itertools as its
import operator
import re

def pick2(xs):
    return ((p[:2], p[2:]) for p in its.permutations(xs))

def allow(op, l, r):
    return op != &amp;#x27;/&amp;#x27; or eval(r) != 0

def apply(op, l, r):
    return &amp;#x27;(%s%s%s)&amp;#x27;%(l, op, r)

def values(xs, ops):
    L = [xs]
    while L:
        xs = L.pop()
        if len(xs) == 1:
            yield xs.pop()
        else:
            L.extend([apply(op, *lr)] + list(tl)
                     for op, (lr, tl) in its.product(ops, pick2(xs))
                     if allow(op, *lr))

def solve(spec, ops):
    numbers = [&amp;#x27;Fraction(%s)&amp;#x27;%n for n in re.compile(r&amp;#x27;\d+&amp;#x27;).findall(spec)]
    target = eval(numbers.pop())
    print(next((v for v in values(numbers, ops) if eval(v) == target), &amp;#x27;Impossible&amp;#x27;))

def main():
    solve(&amp;#x27;2,5,6,6 =&amp;gt; 17&amp;#x27;, &amp;#x27;+-/*&amp;#x27;)
    solve(&amp;#x27;3,3,8,8 =&amp;gt; 24&amp;#x27;, &amp;#x27;+-/*&amp;#x27;)

main()

&lt;/pre&gt;

&lt;/div&gt;</description>
<dc:date>2017-03-08</dc:date>
<guid>https://wordaligned.org/articles/24-puzzles</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/24-puzzles</link>
<category>Python</category>
</item>

<item>
<title>Negative Sequence Indices in Python</title>
<description>&lt;p&gt;Supply a negative index when accessing a sequence and Python counts back from the end. So, for example, &lt;code&gt;my_list[-2]&lt;/code&gt; is the penultimate element of &lt;code&gt;my_list&lt;/code&gt;, which is much better than &lt;code&gt;my_list[len(my_list)-2]&lt;/code&gt; or even &lt;code&gt;*(++my_list.rbegin())&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;That final example uses one of C++&amp;#8217;s reverse iterators. It gets the penultimate element of a collection by advancing an iterator from the reversed beginning of that collection. If you&amp;#8217;re addicted to negative indices you &lt;strong&gt;can&lt;/strong&gt; use them with C++ arrays, sort of.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;&lt;div class=&quot;codetitle&quot;&gt;Negative array indices in C++&lt;/div&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;#include &amp;lt;iostream&amp;gt;

int main()
{
    char const * domain = &quot;wordaligned.org&quot;;
    char const * end = domain + strlen(domain);
    std::cout &amp;lt;&amp;lt; end[-3] &amp;lt;&amp;lt; end[-2] &amp;lt;&amp;lt; end[-1] &amp;lt;&amp;lt; &amp;#x27;\n&amp;#x27;;
    return 0;
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Compiling and running this program outputs the string &amp;#8220;org&amp;#8221;.&lt;/p&gt;
&lt;p&gt;Going back to Python, the valid indices into a sequence of length &lt;code&gt;L&lt;/code&gt; are &lt;code&gt;-L&lt;/code&gt;, &lt;code&gt;-L+1&lt;/code&gt;, &amp;#8230; , &lt;code&gt;0&lt;/code&gt;, &lt;code&gt;1&lt;/code&gt;, &amp;#8230; &lt;code&gt;L-1&lt;/code&gt;. Whenever you write code to calculate an index used for accessing a sequence, and especially if you&amp;#8217;re catching any resulting &lt;code&gt;IndexError&lt;/code&gt;s, it&amp;#8217;s worth checking if the result of the calculation can be negative, and if &amp;#8212; in this case &amp;#8212; you really do want the from-the-back indexing behaviour.&lt;/p&gt;
&lt;div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;0&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;1&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;2&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;3&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;4&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;5&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;6&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;7&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;8&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;9&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;10&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;11&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;12&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;13&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;hellip;&lt;/div&gt;&lt;/div&gt;
&lt;div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;W&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;O&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;R&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;D&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;A&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;L&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;I&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;G&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;N&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;E&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;D&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;/div&gt;
&lt;div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;hellip;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-14&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-13&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-12&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-11&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-10&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-9&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-8&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-7&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-6&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-5&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-4&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-3&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-2&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;-1&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;div style=&quot;line-height: 30px;font-family: monospace;text-align:center;border: 1px solid black;width: 30px;height:30px;display:inline-block;&quot;&gt;&amp;nbsp;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The power of negative indices increases with slicing. Take a slice of a sequence by supplying begin and end indices.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; domain = &amp;#x27;wordaligned.org&amp;#x27;
&amp;gt;&amp;gt;&amp;gt; domain[4:9]
&amp;#x27;align&amp;#x27;
&amp;gt;&amp;gt;&amp;gt; domain[4:-4]
&amp;#x27;aligned&amp;#x27;
&amp;gt;&amp;gt;&amp;gt; digits = list(range(10))
&amp;gt;&amp;gt;&amp;gt; digits
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
&amp;gt;&amp;gt;&amp;gt; digits[3:4]
[3]
&amp;gt;&amp;gt;&amp;gt; digits[1:-1]
[1, 2, 3, 4, 5, 6, 7, 8]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Omitting an index defaults it to the end of the sequence. Omit both indices and both ends of the sequence are defaulted, giving a sliced copy.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; domain[-3:]
&amp;#x27;org&amp;#x27;
&amp;gt;&amp;gt;&amp;gt; domain[:4]
&amp;#x27;word&amp;#x27;
&amp;gt;&amp;gt;&amp;gt; digits[:]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;I prefer the &lt;code&gt;list(digits)&lt;/code&gt; form for copying &lt;code&gt;digits&lt;/code&gt; but you should certainly be familiar with the &lt;code&gt;digits[:]&lt;/code&gt; version.&lt;/p&gt;
&lt;p&gt;You can supply any indices as slice limits, even ones which wouldn&amp;#8217;t be valid for item access. Imagine laying your sequence out on an indexed chopping board, slicing it at the specified points, then taking whatever lies between these points.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; digits[-1000000]
Traceback (most recent call last):
  File &quot;&amp;lt;stdin&amp;gt;&quot;, line 1, in &amp;lt;module&amp;gt;
IndexError: list index out of range
&amp;gt;&amp;gt;&amp;gt; digits[1000000]
Traceback (most recent call last):
  File &quot;&amp;lt;stdin&amp;gt;&quot;, line 1, in &amp;lt;module&amp;gt;
IndexError: list index out of range
&amp;gt;&amp;gt;&amp;gt; digits[-1000000:1000000]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Sequence slicing also takes a step parameter.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; digits[::2]
[0, 2, 4, 6, 8]
&amp;gt;&amp;gt;&amp;gt; digits[1::2]
[1, 3, 5, 7, 9]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This parameter too can be negative. The sign of the step affects which limiting values the &lt;code&gt;begin&lt;/code&gt; and &lt;code&gt;end&lt;/code&gt; slice parameters default to. It&amp;#8217;s subtle behaviour, but you soon get used to it.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; digits[0:10:-2]
[]
&amp;gt;&amp;gt;&amp;gt; digits[::-2]
[9, 7, 5, 3, 1]
&amp;gt;&amp;gt;&amp;gt; digits[-2::-2]
[8, 6, 4, 2, 0]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;How do you reverse a string? Slice it back to front!&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; domain[::-1]
&amp;#x27;gro.dengiladrow&amp;#x27;

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To recap: the default slice limits are the start and end of the sequence, &lt;code&gt;0&lt;/code&gt; and &lt;code&gt;-1&lt;/code&gt;, or &lt;code&gt;-1&lt;/code&gt; and &lt;code&gt;0&lt;/code&gt; if the step is negative. The default step is &lt;code&gt;1&lt;/code&gt; whichever way round the limits are. When slicing, &lt;code&gt;s[i:j:k]&lt;/code&gt;, &lt;code&gt;i&lt;/code&gt; and &lt;code&gt;j&lt;/code&gt; may take any integer value, and &lt;code&gt;k&lt;/code&gt; can take any integer value &lt;strong&gt;except&lt;/strong&gt; &lt;code&gt;0&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The zero value creates another interesting edge case. Here&amp;#8217;s a function to return the last &lt;code&gt;n&lt;/code&gt; items of a sequence.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; def tail(xs, n)
...     return xs[-n:]
...

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It fails when &lt;code&gt;n&lt;/code&gt; is &lt;code&gt;0&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; tail(digits, 3)
[7, 8, 9]
&amp;gt;&amp;gt;&amp;gt; tail(digits, 2)
[8, 9]
&amp;gt;&amp;gt;&amp;gt; tail(digits, 1)
[9]
&amp;gt;&amp;gt;&amp;gt; tail(digits, 0)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;By the way, we&amp;#8217;ve already seen slicing working well with lists and strings. It also works nicely with range objects.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; r = range(10)
&amp;gt;&amp;gt;&amp;gt; r[::2]
range(0, 10, 2)
&amp;gt;&amp;gt;&amp;gt; r[1::2]
range(1, 10, 2)

&lt;/pre&gt;

&lt;/div&gt;</description>
<dc:date>2016-08-01</dc:date>
<guid>https://wordaligned.org/articles/negative-sequence-indices-in-python</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/negative-sequence-indices-in-python</link>
<category>Python</category>
</item>

<item>
<title>Python Streams vs Unix Pipes</title>
<description>&lt;p&gt;I chanced upon an interesting puzzle:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Find the smallest number that can be expressed as the sum of 5, 17, 563, 641 consecutive prime numbers, and is itself a prime number.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src=&quot;https://wordaligned.org/images/primes.png&quot; alt=&quot;Small primes graphic&quot;/&gt;&lt;/p&gt;
&lt;p&gt;Here, the prime numbers are an infinite steam:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;2, 3, 5, 7, 11, 13 ...

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;and sums of N consecutive primes are similarly infinite. For example, the sum of 2 consecutive primes would be the stream:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;2+3, 3+5, 5+7, 7+11, 11+13 ...

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;which is:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;5, 8, 12, 18, 24 ...

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;and the sum of 3 consecutive primes is:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;10 (=2+3+5), 15, 23, 31 ...

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Had we been asked to find the smallest number which can be expressed as the sum of 3 consecutive primes and as the sum of 5 consecutive primes and is itself prime, the answer would be &lt;code&gt;83&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; 23 + 29 + 31
83
&amp;gt;&amp;gt;&amp;gt; 11 + 13 + 17 + 19 + 23
83

&lt;/pre&gt;

&lt;/div&gt;

&lt;h3 id=&quot;infinite-series-and-python&quot;&gt;Infinite series and Python&lt;/h3&gt;
&lt;p&gt;My first thought was to tackle this puzzle using Python iterators and generators. Here&amp;#8217;s the outline of a strategy:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;starting with a stream of primes&lt;/li&gt;
&lt;li&gt;tee the stream to create 4 additional copies&lt;/li&gt;
&lt;li&gt;transform these copies into the consecutive sums of 5, 17, 563 and 641 primes&lt;/li&gt;
&lt;li&gt;now merge these consecutive sums back with the original primes stream&lt;/li&gt;
&lt;li&gt;group the elements of this merged stream by value&lt;/li&gt;
&lt;li&gt;the first group which contains 5 elements must have occurred in every source, and is therefore a prime and representable as the consecutive sum of 5, 17, 563 and 641 primes&lt;/li&gt;
&lt;li&gt;which solves the puzzle!&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Note that when we copy an infinite stream we cannot consume it first. We will have to be lazy or we&amp;#8217;ll get exhausted.&lt;/p&gt;
&lt;p&gt;Courtesy of the Python Cookbook, I already had a couple of &lt;a href=&quot;http://www.onlamp.com/pub/a/python/excerpt/pythonckbk_chap1/index1.html?page=2&quot;&gt;useful&lt;/a&gt; &lt;a href=&quot;http://code.activestate.com/recipes/491285-iterator-merge/&quot;&gt;recipes&lt;/a&gt; to help implement this strategy:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def primes():
    &amp;#x27;&amp;#x27;&amp;#x27;Generate the sequence of prime numbers: 2, 3, 5 ... &amp;#x27;&amp;#x27;&amp;#x27;
    ....

def stream_merge(*ss):
    &amp;#x27;&amp;#x27;&amp;#x27;Merge a collection of sorted streams.

    Example: merge multiples of 2, 3, 5
    &amp;gt;&amp;gt;&amp;gt; from itertools import count, islice
    &amp;gt;&amp;gt;&amp;gt; def multiples(x): return (x * n for n in count(1))
    &amp;gt;&amp;gt;&amp;gt; s = stream_merge(multiples(2), multiples(3), multiples(5))
    &amp;gt;&amp;gt;&amp;gt; list(islice(s, 10))
    [2, 3, 4, 5, 6, 6, 8, 9, 10, 10]
    &amp;#x27;&amp;#x27;&amp;#x27;
    ....

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Both these functions merit a closer look for the cunning use they make of standard containers, but we&amp;#8217;ll defer this inspection until later. In passing, note that &lt;code&gt;stream_merge()&lt;/code&gt;&amp;#8217;s docstring suggests we might try using it as basis for &lt;code&gt;primes()&lt;/code&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;form the series of composite (non-prime) numbers by merging the streams formed by multiples of prime numbers; &lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;the primes remain when you remove these composites from the series of natural numbers.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This scheme is hardly original &amp;#8212; it&amp;#8217;s a variant of &lt;a href=&quot;http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes&quot;&gt;Eratosthenes&amp;#8217; sieve&lt;/a&gt; &amp;#8212; but if you look carefully you&amp;#8217;ll notice the self-reference. Unfortunately recursive definitions of infinite series don&amp;#8217;t work well with Python&lt;a id=&quot;fn1link&quot; href=&quot;https://wordaligned.org/articles/python-streams-vs-unix-pipes#fn1&quot;&gt;&lt;sup&gt;&lt;a href=&quot;https://blog.codeship.com/building-minimal-docker-containers-for-go-applications/&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/a&gt;, hence &lt;code&gt;primes()&lt;/code&gt; requires a little more finesse. We&amp;#8217;ll take a look at it later.&lt;/p&gt;
&lt;p&gt;Moving on, to solve the original puzzle we need a consecutive sum filter. This will transform a stream of numbers into a stream of consecutive sums of these numbers:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;def consecutive_sum(s, n):
    &amp;#x27;&amp;#x27;&amp;#x27;Generate the series of sums of n consecutive elements of s

    Example: 0, 1, 2, 3, 4 ... =&amp;gt; 0+1, 1+2, 2+3, 3+4, ...
    &amp;gt;&amp;gt;&amp;gt; from itertools import count, islice
    &amp;gt;&amp;gt;&amp;gt; list(islice(consecutive_sum(count(), 2), 10))
    [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
    &amp;#x27;&amp;#x27;&amp;#x27;
    lo, hi = itertools.tee(s)
    csum = sum(next(hi) for _ in range(n))
    while True:
        yield csum
        csum += next(hi) - next(lo)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here we can think of the summed elements as lying within a sliding window: each time we slide the window an element gets added to the top and an element gets removed from the bottom, and we adjust &lt;code&gt;csum&lt;/code&gt; accordingly.&lt;/p&gt;
&lt;p&gt;So, now we have:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the series of prime numbers, &lt;code&gt;primes()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;a &lt;code&gt;stream_merge()&lt;/code&gt; connector&lt;/li&gt;
&lt;li&gt;a &lt;code&gt;consecutive_sum()&lt;/code&gt; filter&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The remaining stream adaptors come from the standard &lt;a href=&quot;http://docs.python.org/lib/itertools-functions.html&quot;&gt;itertools module&lt;/a&gt;. Note that the &lt;code&gt;stream_merge()&lt;/code&gt; works here since all the consecutive sum series are strictly increasing. Note also that the stream of prime numbers can be treated as &lt;code&gt;consecutive_sum(s=primes(), n=1)&lt;/code&gt;, handling the &amp;#8220;and is itself a prime number&amp;#8221; requirement.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; lens = 1, 5, 17, 563, 641
&amp;gt;&amp;gt;&amp;gt; N = len(lens)
&amp;gt;&amp;gt;&amp;gt; from itertools import tee, groupby
&amp;gt;&amp;gt;&amp;gt; ps = tee(primes(), N)
&amp;gt;&amp;gt;&amp;gt; csums = [consecutive_sum(p, n) for p, n in zip(ps, lens)]
&amp;gt;&amp;gt;&amp;gt; solns = (n for n, g in groupby(stream_merge(*csums)) 
             if len(list(g)) == N)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, &lt;code&gt;solns&lt;/code&gt; is yet another stream, the result of merging the &lt;code&gt;N&lt;/code&gt; input consecutive sum streams then filtering out the numbers which appear &lt;code&gt;N&lt;/code&gt; times; that is, the numbers which can be expressed as sums of 1, 5, 17, 563 and 641 consecutive primes.&lt;/p&gt;
&lt;p&gt;The first such number solves the original puzzle.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;&amp;gt;&amp;gt;&amp;gt; next(solns)
7002221

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here&amp;#8217;s a picture of how these stream tools link up to solve this particular puzzle. The great thing is that we can reconnect these same tools to solve a wide range of puzzles, and indeed more practical &lt;a href=&quot;http://www.dabeaz.com/generators/&quot;&gt;processing tasks&lt;/a&gt;. To use the common analogy, we direct data streams along pipes.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&quot;Stream connections&quot; src=&quot;https://wordaligned.org/images/pipeline.png&quot;/&gt;&lt;/p&gt;
&lt;h3 id=&quot;infinite-series-in-other-languages&quot;&gt;Infinite series in Other Languages&lt;/h3&gt;
&lt;p&gt;Python is the language I find most convenient most of the time, which explains why I reached for it first. It&amp;#8217;s an increasingly popular language, which helps explain why I didn&amp;#8217;t need to write the tricky parts of my solution from scratch: they&amp;#8217;d already been done. Python is also a language which makes compromises. Having used Python to find a solution to the puzzle I wondered if there wasn&amp;#8217;t some other language better suited to this kind of problem.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://haskell.org&quot;&gt;Haskell&lt;/a&gt; makes no compromises when it comes to functional programming. Its lazy evaluation and inductive recursion make it a perfect fit for this kind of puzzle &amp;#8212; but my  approach of teeing, filtering and merging made me consider the Unix Shell. Now, I use Bash every day and page through its manual at least once a week. Scripting appeals and I&amp;#8217;m comfortable at the command line. How hard could it be to solve this puzzle using Bash? After all, I already knew the answer!&lt;/p&gt;
&lt;h3 id=&quot;partial-sums&quot;&gt;Partial sums.&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s a simple shell function to generate partial sums. I&amp;#8217;ve used &lt;code&gt;awk&lt;/code&gt;, a little language I gave up on a long time ago in favour of more rounded scripting languages like Perl and then Python. Now I look at it again, it seems to fill a useful gap. Awk processes a file sequentially, applying pattern-action rules to each line, a processing template which I&amp;#8217;ve reinvented less cleanly many times. Despite my rediscovery of &lt;code&gt;awk&lt;/code&gt;, I&amp;#8217;ll be keeping its use strongly in check in what follows.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ psum() { awk &amp;#x27;{ print s += $1 }&amp;#x27;; }

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Much like Perl, &lt;code&gt;awk&lt;/code&gt; guesses what you want to do. Here, it conjures the summation variable, &lt;code&gt;s&lt;/code&gt;, into existence, assigning it a default initial value of 0. (Good guess!) Since we&amp;#8217;re doing arithmetic &lt;code&gt;awk&lt;/code&gt; converts the first field of each input line into a number. We can test &lt;code&gt;psum&lt;/code&gt; by using &lt;code&gt;jot&lt;/code&gt; to generate the sequence 1, 2, 3, 4, 5 (this is on a Mac &amp;#8212; on a Linux platform use &lt;code&gt;seq&lt;/code&gt; instead of &lt;code&gt;jot&lt;/code&gt;).&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ jot 5 | psum
1
3
6
10
15

&lt;/pre&gt;

&lt;/div&gt;

&lt;h3 id=&quot;consecutive-sums&quot;&gt;Consecutive sums&lt;/h3&gt;
&lt;p&gt;You may be wondering why we&amp;#8217;ve bothered creating this partial sum filter since it&amp;#8217;s the sums of consecutive elements we&amp;#8217;re after, rather than the sum of the series so far. Well, notice that if &lt;code&gt;P[i]&lt;/code&gt; and &lt;code&gt;P[i+n]&lt;/code&gt; are two elements from the series of partial sums of S, then their difference, &lt;code&gt;P[i+n] - P[i]&lt;/code&gt;, is the sum of the &lt;code&gt;n&lt;/code&gt; consecutive elements from S.&lt;/p&gt;
&lt;p&gt;So to form an n-element consecutive sum series we can tee the partial sums streams, advance one of these by n, then zip through them in parallel finding their differences. An example makes things clear:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ mkfifo pipe
$ jot 5 | psum | tee pipe | tail -n +2 | paste - pipe
3       1
6       3
10      6
15      10
        15

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, &lt;code&gt;jot 5&lt;/code&gt; generates the sequence 1, 2, 3, 4, 5, which &lt;code&gt;psum&lt;/code&gt; progressively accumulates to 1, 3, 6, 10, 15. We then &lt;code&gt;tee&lt;/code&gt; this partial sum series through two pipes: the first, &lt;code&gt;pipe&lt;/code&gt;, is an explicitly created named pipe created by &lt;code&gt;mkfifo&lt;/code&gt;, the second is implicitly created by the pipeline operator, &lt;code&gt;|&lt;/code&gt;. The remainder of the command line delays one series by one (note that &lt;code&gt;tail&lt;/code&gt; numbers lines from &lt;code&gt;1&lt;/code&gt;, not &lt;code&gt;0&lt;/code&gt;, so &lt;code&gt;tail -n +1&lt;/code&gt; is the identity filter) then pastes the two series back together&lt;a id=&quot;fn2link&quot; href=&quot;https://wordaligned.org/articles/python-streams-vs-unix-pipes#fn2&quot;&gt;&lt;sup&gt;&lt;a href=&quot;http://devcenter.wercker.com/docs/quickstarts/advanced/building-minimal-containers-with-go&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;By appending a single &lt;code&gt;awk&lt;/code&gt; action to the pipeline we get a consecutive sum series.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ jot 5 | psum | tee pipe | tail -n +2 | paste - pipe | awk &amp;#x27;{print $1 - $2}&amp;#x27;
2
3
4
5
15

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The output 2, 3, 4, 5 is the series of consecutive sums of length 1 taken from the original series 1, 2, 3, 4, 5. The trailing 15 and the 1 missed from the start are edge case problems, and easily corrected.&lt;/p&gt;
&lt;p&gt;Accumulating an increasing series of numbers in order to find the differences between elements lying a given distance apart on this series isn&amp;#8217;t a very smart idea on a computer with a fixed word-size, but it&amp;#8217;s good to know (e.g.) that &lt;code&gt;awk&lt;/code&gt; doesn&amp;#8217;t stop counting at 32 bits.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ let &quot;N=1&amp;lt;&amp;lt;32&quot; &amp;amp;&amp;amp; echo $N | tee &amp;gt;(awk &amp;#x27;{print $1 * $1}&amp;#x27;)
4294967296
18446744073709551616

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Exactly if and when awk stops counting, I&amp;#8217;m not sure. The documentation doesn&amp;#8217;t say and I haven&amp;#8217;t looked at the source code.&lt;/p&gt;
&lt;h3 id=&quot;bug-fixes&quot;&gt;Bug Fixes&lt;/h3&gt;
&lt;p&gt;Let&amp;#8217;s capture these tiny functions and name them. Here, then, are revised &lt;code&gt;psum()&lt;/code&gt; and &lt;code&gt;sdiff()&lt;/code&gt; filters. The edge case problems should now be fixed.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ psum()  { awk &amp;#x27;BEGIN { print 0 }{print s += $1 }&amp;#x27;; }
$ delay() { let &quot;n = $1 + 1&quot; &amp;amp;&amp;amp; tail +$n; } 
$ sdiff() { mkfifo p.$1 &amp;amp;&amp;amp; tee p.$1 | delay $1 | paste - p.$1 | \
            awk &amp;#x27;NF == 2 {print $1 - $2 }&amp;#x27;; }

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A quick test:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ jot 5 | psum | sdiff 3
6
9
12

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The output is, as expected, the series of sums of consecutive triples taken from 1, 2, 3, 4, 5 (6=1+2+3, 9=2+3+4, 12=3+4+5).&lt;/p&gt;
&lt;p&gt;There&amp;#8217;s a pernicious bug, though. These functions can&amp;#8217;t handle infinite series so they are of limited use as pipeline tools. For example, if we stream in the series 0, 1, 2, &amp;#8230; (generated here as the partial sums of the series 1, 1, 1, &amp;#8230;) nothing gets output and we have to interrupt the process.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;# This command appears to hang
$ yes 1 | psum | sdiff 1
^C

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To work around this is, we can use Gnu &lt;code&gt;stdbuf&lt;/code&gt; to prohibit &lt;code&gt;tail&lt;/code&gt; and &lt;code&gt;paste&lt;/code&gt; from using output buffers.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ psum()  { awk &amp;#x27;BEGIN { print 0 }{print s += $1 }&amp;#x27;; }
$ delay() { let &quot;n = $1 + 1&quot; &amp;amp;&amp;amp; stdbuf -o 0 tail +$n; } 
$ sdiff() { mkfifo p.$1 &amp;amp;&amp;amp; tee p.$1 | delay $1 | \
            stdbuf -o 0 paste - p.$1 | \
            awk &amp;#x27;NF == 2 {print $1 - $2 }&amp;#x27;; }

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Now the data flows again:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;# Accumulate the stream 1 1 1 ...
# and print the difference between successive elements
$ yes 1 | psum | sdiff 1
1
1
1
1
^C

&lt;/pre&gt;

&lt;/div&gt;

&lt;h3 id=&quot;merging-streams&quot;&gt;Merging Streams&lt;/h3&gt;
&lt;p&gt;The Unix shell merges streams rather more succinctly than Python. &lt;code&gt;Sort -m&lt;/code&gt; does the job directly. Note that a standard &lt;code&gt;sort&lt;/code&gt; cannot yield any output until all its inputs are exhausted, since the final input item might turn out to be the one which should appear first in the output. Merge sort, &lt;code&gt;sort -m&lt;/code&gt;, can and does produce output without delay&lt;a id=&quot;fn3link&quot; href=&quot;https://wordaligned.org/articles/python-streams-vs-unix-pipes#fn3&quot;&gt;&lt;sup&gt;[3]&lt;/sup&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ yes | sort
^C
$ yes | sort -m
y
y
y
y
y
^C

&lt;/pre&gt;

&lt;/div&gt;

&lt;h3 id=&quot;generating-primes&quot;&gt;Generating Primes&lt;/h3&gt;
&lt;p&gt;No doubt it&amp;#8217;s possible to generate the infinite series of prime numbers using native Bash code, but I chose to reuse the &lt;a href=&quot;http://www.onlamp.com/pub/a/python/excerpt/pythonckbk_chap1/index1.html?page=2&quot;&gt;Python Cookbook recipe&lt;/a&gt; for the job.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;&lt;div class=&quot;codetitle&quot;&gt;primes&lt;/div&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;#!/usr/bin/env python
import itertools

def primes():
    &amp;#x27;&amp;#x27;&amp;#x27;Generate the prime number series: 2, 3, 5 ... &amp;#x27;&amp;#x27;&amp;#x27;
    D = {}
    for n in itertools.count(2):
        p = D.pop(n, None)
        if p is None:
            yield n
            D[n * n] = n
        else:
            x = n + p
            while x in D:
                x += p
            D[x] = p

for p in primes():
    print(p)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This is a subtle little program which makes clever use of Python&amp;#8217;s native hashed array container, the dictionary. In this case dictionary values are the primes less than &lt;code&gt;n&lt;/code&gt; and the keys are composite multiples of these primes. The loop invariant, roughly speaking, is that the dictionary values are the primes less than &lt;code&gt;n&lt;/code&gt;, and the corresponding keys are the lowest multiples of these primes greater than or equal to &lt;code&gt;n&lt;/code&gt;. It&amp;#8217;s a lazy, recursion-free take of Eratosthenes&amp;#8217; sieve.&lt;/p&gt;
&lt;p&gt;For the purposes of this article the important things about this program are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;it generates an infinite series of numbers to standard output&lt;a id=&quot;fn4link&quot; href=&quot;https://wordaligned.org/articles/python-streams-vs-unix-pipes#fn4&quot;&gt;&lt;sup&gt;[4]&lt;/sup&gt;&lt;/a&gt;, making it a good source for a shell pipeline&lt;/li&gt;
&lt;li&gt;by making it executable and adding the usual shebang incantation, we can invoke this Python program seamlessly from the shell.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;pipe-connection&quot;&gt;Pipe Connection&lt;/h3&gt;
&lt;p&gt;Recall the original puzzle:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Find the smallest number that can be expressed as the sum of 5, 17, 563, 641 consecutive prime numbers, and is itself a prime number.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;First, let&amp;#8217;s check the connections by solving a simpler problem which we can manually verify: to find prime numbers which are also the sum of 2 consecutive primes. As we noted before, this is the same as finding primes numbers which are the consecutive sums of 1 and 2 primes.&lt;/p&gt;
&lt;p&gt;In one shell window we create a couple of named pipes, &lt;code&gt;c.1&lt;/code&gt; and &lt;code&gt;c.2&lt;/code&gt;, which we&amp;#8217;ll use to stream the consecutive sum series of 1 and 2 primes respectively. The results series comprises the duplicates when we merge these pipes.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;&lt;div class=&quot;codetitle&quot;&gt;Shell 1&lt;/div&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ mkfifo c.{1,2}
$ sort -mn c.{1,2} | uniq -d

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In another shell window, stream data into c.1 and c.2:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;&lt;div class=&quot;codetitle&quot;&gt;Shell 2&lt;/div&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ for i in 1 2; do (primes | psum | sdiff $i &amp;gt; c.$i) &amp;amp; done

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In the first window we see the single number &lt;code&gt;5&lt;/code&gt;, which is the first and only prime number equal to the sum of two consecutive primes.&lt;/p&gt;
&lt;p&gt;Prime numbers equal to the sum of three consecutive primes are more interesting. In each shell window recall the previous commands and switch the 2s to 3s (a simple command history recall and edit, &lt;code&gt;^2^3^&lt;/code&gt;, does the trick). The merged output now looks like this:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ sort -mn c.1 c.3 | uniq -d
23
31
41
...

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We can check the first few values:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;23 = 5 + 7 + 11
31 = 7 + 11 + 13
41 = 11 + 13 + 17

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;At this point we&amp;#8217;re confident enough to give the actual puzzle a try. Start up the solutions stream.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ mkfifo c.{1,5,17,563,641}
$ sort -mn c.{1,5,17,563,641} | uniq -c | grep &quot;5 &quot;

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, we use a standard &lt;a href=&quot;https://wordaligned.org/articles/shell-script-sets&quot;&gt;shell script set intersection&lt;/a&gt; recipe: &lt;code&gt;uniq -c&lt;/code&gt; groups and counts repeated elements, and the &lt;code&gt;grep&lt;/code&gt; pattern matches those numbers common to all five input streams.&lt;/p&gt;
&lt;p&gt;Now we can kick off the processes which will feed into the consecutive sum streams, which &lt;code&gt;sort&lt;/code&gt; is waiting on.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ for i in 1 5 17 563 641; do (primes | psum | sdiff $i &amp;gt; c.$i) &amp;amp; done

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Sure enough, after about 15 seconds elapsed time&lt;a id=&quot;fn5link&quot; href=&quot;https://wordaligned.org/articles/python-streams-vs-unix-pipes#fn5&quot;&gt;&lt;sup&gt;[5]&lt;/sup&gt;&lt;/a&gt;, out pops the result:&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ sort -mn c.{1,5,17,563,641} | uniq -c | grep &quot;5 &quot;
    5 7002221

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;15 seconds seems an eternity for arithmetic on a modern computer (you could start up a word processor in less time!), and you might be inclined to blame the overhead of all those processes, files, large numbers, etc. In fact it took around 6 seconds for the Python program simply to generate prime numbers up to 7002221, and my pure Python solution ran in 9 seconds.&lt;/p&gt;
&lt;h3 id=&quot;portability&quot;&gt;Portability&lt;/h3&gt;
&lt;p&gt;One of the most convenient things about Python is its portability. I don&amp;#8217;t mean &amp;#8220;portable so long as you conform to the language standard&amp;#8221; or &amp;#8220;portable if you stick to a subset of the language&amp;#8221; &amp;#8212; I mean that a Python program works whatever platform I use without me having to worry about it.&lt;/p&gt;
&lt;p&gt;Non-portability put me off the Unix shell when I first encountered it: there seemed too many details, too many platform differences &amp;#8212; which shell are you using? which extensions? which implementation of the core utilities, etc, etc? Readily available and well-written documentation didn&amp;#8217;t help much here: generally I want the shell to just do what I mean, which is why I switched so happily to Perl when I discovered it.&lt;/p&gt;
&lt;p&gt;Since then this situation has, for me, improved in many ways. Non-Unix platforms are declining as are the different flavours of Unix. Bash seems to have become the standard shell of choice and Cygwin gets better all the time. GNU coreutils predominate. As a consequence I&amp;#8217;ve forgotten almost all the Perl I ever knew and am actively rediscovering the Unix shell.&lt;/p&gt;
&lt;p&gt;Writing this article, though, I was reminded of the platform dependent behaviour which used to discourage me. On a Linux platform close to hand I had to use &lt;code&gt;seq&lt;/code&gt; instead of &lt;code&gt;jot&lt;/code&gt;, and &lt;code&gt;awk&lt;/code&gt; formatted large integers in a scientific form with a loss of precision.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;&lt;div class=&quot;codetitle&quot;&gt;Loss of precision&lt;/div&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ echo &amp;#x27;10000000001 0&amp;#x27; | awk &amp;#x27;{print $1 - $2}&amp;#x27;
1e+10

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;On OS X the same command outputs 10000000001. I couldn&amp;#8217;t tell you which implementation is more correct. The fix is to explicitly format these numbers as decimal integers, but the danger is that the shell silently swallows these discrepancies and you&amp;#8217;ve got a portability problem you don&amp;#8217;t even notice.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;&lt;div class=&quot;codetitle&quot;&gt;Precision recovered&lt;/div&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;$ echo &amp;#x27;10000000001 0&amp;#x27; | awk &amp;#x27;{printf &quot;%d\n&quot;, $1 - $2}&amp;#x27;
10000000001

&lt;/pre&gt;

&lt;/div&gt;

&lt;h3 id=&quot;stream-merge&quot;&gt;Stream Merge&lt;/h3&gt;
&lt;p&gt;I mentioned &lt;code&gt;stream_merge()&lt;/code&gt; at the start of this article, a general purpose function written by Raymond Hettinger which I originally found in the Python Cookbook. As with the prime number generator, you might imagine the merge algorithm to be recursively defined:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;to merge a pair of streams, take items from the first which are less than the head of the second, then swap;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;to merge N streams, merge the first stream with the merged (N-1) rest.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Again the Python solution does it differently, this time using a heap as a priority queue of items from the input streams. It&amp;#8217;s ingenious and efficient. Look how easy it is in Python to shunt functions and pairs in and out of queues.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;from heapq import heapify, heappop, heapreplace

def stream_merge(*ss):
    &amp;#x27;&amp;#x27;&amp;#x27;Merge a collection of sorted streams.&amp;#x27;&amp;#x27;&amp;#x27;
    pqueue = []
    for i in map(iter, ss):
        try:
            pqueue.append((i.next(), i.next))
        except StopIteration:
            pass
    heapify(pqueue)
    while pqueue:
        val, it = pqueue[0]
        yield val
        try:
            heapreplace(pqueue, (it(), it))
        except StopIteration:
            heappop(pqueue)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A more sophisticated version of this code has made it into the Python standard library, where it goes by the name of &lt;a href=&quot;http://docs.python.org/dev/library/heapq.html#heapq.merge&quot;&gt;heapq.merge&lt;/a&gt; (I wonder why it wasn&amp;#8217;t filed in &lt;a href=&quot;http://docs.python.org/lib/itertools-functions.html&quot;&gt;itertools&lt;/a&gt;?)&lt;/p&gt;
&lt;h3 id=&quot;alternative-solutions&quot;&gt;Alternative Solutions&lt;/h3&gt;
&lt;p&gt;As usual Haskell wins the elegance award, so I&amp;#8217;ll quote in full a solution built without resorting to cookbookery which produces the (correct!) answer in 20 seconds.&lt;/p&gt;
&lt;div class=&quot;typocode&quot;&gt;

&lt;pre class=&quot;prettyprint&quot;&gt;module Main where

import List

isPrime x = all (\i -&amp;gt; 0/=x`mod`i) $ takeWhile (\i -&amp;gt; i*i &amp;lt;= x) primes

primes = 2:filter (\x -&amp;gt; isPrime x) [3..]

cplist n = map (sum . take n) (tails primes)

meet (x:xs) (y:ys) | x &amp;lt; y = meet xs (y:ys)
                   | y &amp;lt; x = meet (x:xs) ys
                   | x == y =  x:meet xs ys

main = print $ head $ \
(primes `meet` cplist 5) `meet` (cplist 17 `meet` cplist 563) `meet` cplist 641

&lt;/pre&gt;

&lt;/div&gt;

&lt;hr /&gt;
&lt;p&gt;&lt;a id=&quot;fn1&quot; href=&quot;https://wordaligned.org/articles/python-streams-vs-unix-pipes#fn1link&quot;&gt;&lt;a href=&quot;https://blog.codeship.com/building-minimal-docker-containers-for-go-applications/&quot;&gt;1&lt;/a&gt;&lt;/a&gt; CPython, more precisely &amp;#8212; I don&amp;#8217;t think anything in the Python language itself prohibits tail recursion. Even using CPython, yet another &lt;a href=&quot;http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/496691&quot;&gt;recipe&lt;/a&gt; from the online Python Cookbook explores the idea of an &lt;code&gt;@tail_recursion&lt;/code&gt; decorator.&lt;/p&gt;
&lt;p&gt;&lt;a id=&quot;fn2&quot; href=&quot;https://wordaligned.org/articles/python-streams-vs-unix-pipes#fn2link&quot;&gt;&lt;a href=&quot;http://devcenter.wercker.com/docs/quickstarts/advanced/building-minimal-containers-with-go&quot;&gt;2&lt;/a&gt;&lt;/a&gt; &lt;code&gt;Tail&lt;/code&gt; is more commonly used to yield a fixed number of lines from the end of the file: by prefixing the line count argument with a &lt;code&gt;+&lt;/code&gt; sign, it skips lines from the head of the file. The GNU version of &lt;code&gt;head&lt;/code&gt; can similarly be used with a &lt;code&gt;-&lt;/code&gt; prefix to skip lines at the tail of a file. The notation is {compact,powerful,subtle,implementation dependent}.&lt;/p&gt;
&lt;p&gt;&lt;a id=&quot;fn3&quot; href=&quot;https://wordaligned.org/articles/python-streams-vs-unix-pipes#fn3link&quot;&gt;[3]&lt;/a&gt; &lt;code&gt;Sort -m&lt;/code&gt; is a sort which doesn&amp;#8217;t really sort &amp;#8212; its inputs should already be sorted &amp;#8212; rather like the &lt;code&gt;+n&lt;/code&gt; option turning &lt;code&gt;tail&lt;/code&gt; on its head.&lt;/p&gt;
&lt;p&gt;&lt;a id=&quot;fn4&quot; href=&quot;https://wordaligned.org/articles/python-streams-vs-unix-pipes#fn4link&quot;&gt;[4]&lt;/a&gt; The series is infinite in theory only: at time &lt;code&gt;n&lt;/code&gt; the number of items in the &lt;code&gt;has_prime_factors&lt;/code&gt; dictionary equals the number of primes less than &lt;code&gt;n&lt;/code&gt;, and each key in this dictionary is larger than &lt;code&gt;n&lt;/code&gt;. So resource use increases steadily as &lt;code&gt;n&lt;/code&gt; increases.&lt;/p&gt;
&lt;p&gt;&lt;a id=&quot;fn5&quot; href=&quot;https://wordaligned.org/articles/python-streams-vs-unix-pipes#fn5link&quot;&gt;[5]&lt;/a&gt; I used a MacBook laptop used to run these scripts. &lt;/p&gt;
&lt;pre&gt;
  Model Name:               MacBook
  Model Identifier:         MacBook1,1
  Processor Name:           Intel Core Duo
  Processor Speed:          2 GHz
  Number Of Processors:     1
  Total Number Of Cores:    2
  L2 Cache (per processor): 2 MB
  Memory:                   2 GB
  Bus Speed:                667 MHz
&lt;/pre&gt;</description>
<dc:date>2016-07-28</dc:date>
<guid>https://wordaligned.org/articles/python-streams-vs-unix-pipes</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/python-streams-vs-unix-pipes</link>
<category>Python</category>
</item>

<item>
<title>8 Queens Puzzle</title>
<description>&lt;p style=&quot;font-size:4em&quot;&gt;&amp;#9819;&amp;#9819;&amp;#9819;&amp;#9819;&amp;#9819;&amp;#9819;&amp;#9819;&amp;#9819;&lt;/p&gt;

&lt;p&gt;Here&amp;#8217;s one of my favourite &lt;a href=&quot;http://code.activestate.com/recipes/576647-eight-queens-six-lines&quot;&gt;recipes, by Raymond Hettinger&lt;/a&gt;, lightly adapted for Python 3.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from itertools import permutations

n = width_of_chessboard = 8
sqs = range(n)

Qs = (Q for Q in permutations(sqs)
      if n == len({Q[i]+i for i in sqs})
           == len({Q[i]-i for i in sqs}))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We start by assigning &lt;code&gt;sqs&lt;/code&gt; to the range 0 through 7.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; sqs = range(8)
&amp;gt;&amp;gt;&amp;gt; list(sqs)
[0, 1, 2, 3, 4, 5, 6, 7]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The range has 8 indices. If each index represents a column on a standard 8x8 chessboard and the value at that index represents a row on the same chessboard, then our range represents 8 positions on the board. Using the built-in &lt;a href=&quot;https://docs.python.org/3/library/functions.html#enumerate&quot;&gt;enumerate&lt;/a&gt; function to generate these &lt;code&gt;(index, value)&lt;/code&gt; pairs we see that &lt;code&gt;sqs&lt;/code&gt; encodes the diagonal &lt;code&gt;(0, 0)&lt;/code&gt; to &lt;code&gt;(7, 7)&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; list(enumerate(sqs))
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7)]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Next, permute the values &amp;#8212; the rows.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; from itertools import permutations
&amp;gt;&amp;gt;&amp;gt; rooks = permutations(sqs)
&amp;gt;&amp;gt;&amp;gt; next(rooks)
(0, 1, 2, 3, 4, 5, 6, 7)
&amp;gt;&amp;gt;&amp;gt; next(rooks)
(0, 1, 2, 3, 4, 5, 7, 6)
&amp;gt;&amp;gt;&amp;gt; next(rooks)
(0, 1, 2, 3, 4, 6, 5, 7)
&amp;gt;&amp;gt;&amp;gt; list(rooks)[34567]
(6, 7, 0, 1, 3, 4, 5, 2)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;a href=&quot;https://docs.python.org/3/library/itertools.html#itertools.permutations&quot;&gt;Itertools.permutations&lt;/a&gt; generates values lazily. The snippet above shows the first two results, then skips forward 34568 places. &lt;code&gt;Permutations(sqs)&lt;/code&gt; generates all possible arrangements of 8 pieces on a chessboard such that each row has exactly one piece on it and so does each column. That is, it generates all possible ways of placing 8 &lt;a href=&quot;http://mathworld.wolfram.com/RooksProblem.html&quot;&gt;rooks on a chessboard&lt;/a&gt; so that no pair attacks each other.&lt;/p&gt;
&lt;p&gt;In the final program, we filter these rook positions to generate solutions to the more challenging &amp;#8212; and more interesting &amp;#8212; &lt;a href=&quot;https://en.wikipedia.org/wiki/Eight_queens_puzzle&quot;&gt;eight Queens puzzle&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Consider our starting point, the diagonal &lt;code&gt;(0, 0)&lt;/code&gt; to &lt;code&gt;(7, 7)&lt;/code&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; diagonal = range(8)
&amp;gt;&amp;gt;&amp;gt; {r-c for c,r in enumerate(diagonal)}
{0}
&amp;gt;&amp;gt;&amp;gt; {r+c for c,r in enumerate(diagonal)}
{0, 2, 4, 6, 8, 10, 12, 14}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here, a set comprehension collects the distinct values taken by the difference between the row and column along this diagonal, which in this case gives &lt;code&gt;{0}&lt;/code&gt;. That is, if we placed 8 bishops along this &amp;#x2197; diagonal they would all attack each other along this diagonal. The sum of the row and column takes 8 distinct values, however, meaning no pair attacks along a &amp;#x2196; diagonal.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://docs.python.org/3/reference/expressions.html#comparisons&quot;&gt;Comparison operators chain in Python&lt;/a&gt;, so the expression:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;n == len({Q[i]+i for i in sqs}) == len({Q[i]-i for i in sqs})
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;is &lt;code&gt;True&lt;/code&gt; if both sets have 8 elements, that is, if the squares in &lt;code&gt;Q&lt;/code&gt; are on distinct &amp;#x2196; and &amp;#x2197; diagonals; or, equivalently no pair of bishops placed on the squares in &lt;code&gt;Q&lt;/code&gt; would attack each other. Since we already know &lt;code&gt;Q&lt;/code&gt; positions 8 rooks so that no pair attacks each other, and a chess Queen combines the moves of a rook and a bishop, we can see that &lt;code&gt;Qs&lt;/code&gt; generates every possible way of placing 8 Queens on a chessboard so that no pair attacks each other: which is to say, we&amp;#8217;ve solved the &lt;a href=&quot;https://en.wikipedia.org/wiki/Eight_queens_puzzle&quot;&gt;8 Queens puzzle&lt;/a&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Qs = (Q for Q in permutations(sqs)
      if n == len({Q[i]+i for i in sqs})
           == len({Q[i]-i for i in sqs}))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is beautiful code and there&amp;#8217;s one final twist.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Qs&lt;/code&gt; is a &lt;a href=&quot;https://docs.python.org/3/reference/expressions.html#generator-expressions&quot;&gt;generator expression&lt;/a&gt; primed to permute squares into neighbourly rooks filtered by amicable bishops yielding unthreatening Queens. Until asked, however, it does nothing.&lt;/p&gt;
&lt;p style=&quot;font-size:4em&quot;&gt;&amp;#9813;&amp;#9813;&amp;#9813;&amp;#9813;&amp;#9813;&amp;#9813;&amp;#9813;&amp;#9813;&lt;/p&gt;</description>
<dc:date>2016-04-04</dc:date>
<guid>https://wordaligned.org/articles/8-queens-puzzle</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/8-queens-puzzle</link>
<category>Python</category>
</item>

<item>
<title>Easy as Py</title>
<description>&lt;h2 id=&quot;what-makes-python-simple&quot;&gt;What makes Python Simple?&lt;/h2&gt;
&lt;p&gt;I consider Python a simple language. Here&amp;#8217;s why.&lt;/p&gt;
&lt;h2 id=&quot;easy-to-read&quot;&gt;Easy to Read&lt;/h2&gt;
&lt;p&gt;I can read and understand Python code (unless it&amp;#8217;s &lt;a href=&quot;https://benkurtovic.com/2014/06/01/obfuscating-hello-world.html&quot;&gt;wilfully perverse&lt;/a&gt;). Syntactic whitespace and the associated removal of punctuation results in a regular, open layout. The combination of built in containers, extensive standard libraries and high level constructs allow for clear, compact code: code which fits in your head.&lt;/p&gt;
&lt;h2 id=&quot;easy-to-write&quot;&gt;Easy to Write&lt;/h2&gt;
&lt;p&gt;I can write Python code which is free of syntax errors and which does what I want. Of course it helps that I&amp;#8217;ve been actively using the language for 15 years, but I&amp;#8217;ve been using C++ for longer and still make mistakes with it: ask me to declare a pointer to a member function, for example, or to knock up a variadic template function, and I&amp;#8217;ll need a moment or two.&lt;/p&gt;
&lt;h2 id=&quot;transparent&quot;&gt;Transparent&lt;/h2&gt;
&lt;p&gt;I also consider C a simple language. C offers a transparent abstraction of a register machine, with a stack, a heap, and addressable memory. If you can imagine the operation of such a machine, you can figure out C. Python is less transparent but reveals its workings if pressed. Dicts form a part of the language seen by users, and under the hood they provide the dynamic context which supports a running program. The &lt;a href=&quot;https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop&quot;&gt;read-eval-print loop&lt;/a&gt; makes it easy to poke and reshape your program. You can &lt;a href=&quot;https://docs.python.org/3/library/dis.html&quot;&gt;disassemble code&lt;/a&gt; to see what the virtual machine sees.&lt;/p&gt;
&lt;h2 id=&quot;consistent-improvement&quot;&gt;Consistent improvement&lt;/h2&gt;
&lt;p&gt;The language has got better since I first started using it. It has also got bigger, and this growth would, at first, seem at odds with simplicity. However, consider &amp;#8212; as an example &amp;#8212; the point when list comprehensions were introduced. Language support for building a list from an iterable results in compact declarative code. Simple code. What&amp;#8217;s more, the square brackets which now delimit list comprehensions are the same square brackets that were previously used to delimit lists. The syntax may have been new but it didn&amp;#8217;t surprise. Now consider the introduction of set and dict comprehensions, which follow logically and naturally from list comprehensions, almost as if they were discovered rather than invented.&lt;/p&gt;
&lt;p&gt;There are many other examples where additions to the language have unified and simplified.&lt;/p&gt;
&lt;h2 id=&quot;vision&quot;&gt;Vision&lt;/h2&gt;
&lt;p&gt;I&amp;#8217;m not a Python insider and cannot comment on the exact balance of benevolence and dictatorship which goes into the &lt;a href=&quot;https://www.python.org/dev/peps/&quot;&gt;language enhancement process&lt;/a&gt;. I would say Python doesn&amp;#8217;t suffer from being designed by a committee. It sticks to its strengths and its direction, to its vision.&lt;/p&gt;</description>
<dc:date>2016-03-23</dc:date>
<guid>https://wordaligned.org/articles/easy-as-py</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/easy-as-py</link>
<category>Python</category>
</item>

<item>
<title>Sausages, sausages, sausages - slice, slice, slice</title>
<description>&lt;p&gt;A friend asked for help reaching the next level of a puzzle game. The test which stalled her involves machine placement in a sausage factory.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&amp;#8230; each sausage was branded with a letter for quality control purposes, thus:
&lt;strong&gt;ypbtkizfgxptclcoirdsuhjwulqkoszrabfc&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The string was then drawn through seven machines which rearranged the sausages in flavour enhancing ways.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Machine A: The Reversifier&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Reverses the order of the sausages, so they get tastier as you go along.&lt;/p&gt;
&lt;p&gt;&amp;#8230;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Machine G: Secondhalffirstifier&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;move the second half of the string to the beginning, as the earlier sausages are too spicy to eat early in the morning.&lt;/p&gt;
&lt;p&gt;He attached these machines in a certain sequence, though one of them was out for repair so only six were used. He then fed a string of sausages through and was surprised to discover the string that came out at the other end said &lt;strong&gt;lickyourlips&lt;/strong&gt;. What order were the machines in?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It&amp;#8217;s nicely phrased, but what&amp;#8217;s really wanted is the sequence of simple transformations that takes input &amp;#8220;ypbtkizfgxptclcoirdsuhjwulqkoszrabfc&amp;#8221; and produces output &amp;#8220;lickyourlips&amp;#8221;.&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s no doubt possible to work backwards and figure out a solution using no more than logic, pencil and paper. For example, only two of the machines change the length of the string, and &amp;#8212; looking at the before and after lengths &amp;#8212; these must both be used. It&amp;#8217;s rather easier to write a short program to find a solution.&lt;/p&gt;
&lt;p&gt;First we must simulate the seven sausage machines, A-G, which perform the following sequence operations.&lt;/p&gt;
&lt;ol type=&quot;A&quot;&gt;
&lt;li&gt;reverse the order of a sequence&lt;/li&gt;
&lt;li&gt;remove every other element of a sequence&lt;/li&gt;
&lt;li&gt;remove every third element of a sequence&lt;/li&gt;
&lt;li&gt;pairwise reverse elements of a sequence&lt;/li&gt;
&lt;li&gt;move even numbered elements to the front of a sequence&lt;/li&gt;
&lt;li&gt;move the last element of a sequence to the front&lt;/li&gt;
&lt;li&gt;swap the front and back half of a sequence&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;None of these is difficult, especially in a high-level language which builds in support for sequence operations. What I found noteworthy is that a solution can be found without any &lt;a href=&quot;https://docs.python.org/3/reference/compound_stmts.html&quot;&gt;loops or if statements&lt;/a&gt;. What&amp;#8217;s more, every operation can handled using nothing more than &lt;a href=&quot;https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range&quot;&gt;slice operations&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here&amp;#8217;s my solution. The machines consist of slice operations, helped by a couple of conditional expressions and recursive calls. The solution can then be brute-forced: there are only 5040 ways of permuting 6 out of 7 machines.&lt;/p&gt;
&lt;script src=&quot;https://gist.github.com/wordaligned/a04308eac7ee2aa37e01.js&quot;&gt;&lt;/script&gt;

&lt;p&gt;I&amp;#8217;ve used &lt;code&gt;reduce&lt;/code&gt; to apply a chain of functions to a string of sausages &amp;#8212; an explicit loop might be clearer, but I want a loop-free solution. For this same reason I use recursion in the pairwise swapper and the element dropper. Generally in Python, recursion is a poor choice. In this case I know I&amp;#8217;m starting with a string of just 36 elements which cannot get any longer; there&amp;#8217;s no risk of exceeding the &lt;a href=&quot;https://docs.python.org/3/library/sys.html#sys.getrecursionlimit&quot;&gt;system recursion limit&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The sequence reversal &lt;code&gt;s[::-1]&lt;/code&gt; is idiomatic but alarming to the uninitiated. Slices have &lt;code&gt;[start:stop:stride]&lt;/code&gt; fields, any of which may be defaulted. Usually &lt;code&gt;start&lt;/code&gt; and &lt;code&gt;stop&lt;/code&gt; default to the start and end of the sequence, but in this case the negative stride reverses them.&lt;/p&gt;
&lt;p&gt;To rotate the last element of a sequence to the front, prefer:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;return s[-1:] + s[:-1]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;to:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;return [s[-1]] + s[:-1]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;because the latter raises an &lt;code&gt;IndexError&lt;/code&gt; for an empty sequence.&lt;/p&gt;
&lt;p&gt;Slicing is a formidable tool for sequence manipulation, especially when combined with the option of using negative indices to count back from the end. Slices allow you to reverse, rotate and partition sequences, to pairwise swap elements, and to drop every nth element.&lt;/p&gt;
&lt;p&gt;The miniature recipes presented here don&amp;#8217;t even use slice assignment, which gives me an excuse to reproduce this elegant prime sieve function, which does.&lt;/p&gt;
&lt;script src=&quot;https://gist.github.com/wordaligned/09c17eaabb6cd4c6bcfb.js&quot;&gt;&lt;/script&gt;</description>
<dc:date>2016-03-21</dc:date>
<guid>https://wordaligned.org/articles/sausages-slices</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/sausages-slices</link>
<category>Python</category>
</item>

<item>
<title>Sledgehammers vs Nut Crackers</title>
<description>&lt;h2 id=&quot;awk&quot;&gt;Awk&lt;/h2&gt;
&lt;p&gt;I get awk and can read awk programs but find the language of little use. Its focus is narrow and its syntax can be surprising. Python matches it at home and smashes it away. Nonetheless, a number of the &lt;a href=&quot;./advent-of-code&quot;&gt;advent of code&lt;/a&gt; puzzles fit the awk processing model &amp;#8212; line based instructions, the interpretation of which builds state contributing to the final result &amp;#8212; and I enjoyed writing awk solutions. There&amp;#8217;s satisfaction in using a tool which is up to the job, no more and no less: in using a nutcracker, rather than a sledgehammer, to crack a nut.&lt;/p&gt;
&lt;h2 id=&quot;puzzle&quot;&gt;Puzzle&lt;/h2&gt;
&lt;p&gt;For example, the puzzle set on &lt;a href=&quot;http://adventofcode.com/day/6&quot;&gt;day 6&lt;/a&gt; consists of a list of instructions for switching and toggling a 1000 x 1000 grid of lights. The instructions read:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;turn on 296,50 through 729,664
turn on 212,957 through 490,987
toggle 171,31 through 688,88
turn off 991,989 through 994,998
....
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and the question is, after following these instructions, &lt;strong&gt;how many lights are lit&lt;/strong&gt;?&lt;/p&gt;
&lt;p&gt;Each instruction is a single line; the actions &amp;#8212; turn on, turn off, toggle &amp;#8212; can be matched by patterns; and to follow these actions requires no more than an array and arithmetic: &lt;code&gt;awk&lt;/code&gt; fits nicely.&lt;/p&gt;
&lt;h2 id=&quot;code&quot;&gt;Code&lt;/h2&gt;
&lt;script src=&quot;https://gist.github.com/wordaligned/ceb8671aad6a3416e094.js&quot; type=&quot;text/javascript&quot;&gt;&lt;/script&gt;

&lt;h2 id=&quot;comments&quot;&gt;Comments&lt;/h2&gt;
&lt;p&gt;Here, the &lt;code&gt;BEGIN&lt;/code&gt; action sets the field separator &lt;code&gt;FS&lt;/code&gt; to the regular expression &lt;code&gt;[ ,]&lt;/code&gt; which picks out the textual and numeric fields from each instruction. Awk is highly dynamic: use a variable as a number and it becomes a number, in this case the coordinates of a lighting grid; and similarly, the fields &amp;#8220;on&amp;#8221;, &amp;#8220;off&amp;#8221; and &amp;#8220;toggle&amp;#8221; are matched and treated as strings.&lt;/p&gt;
&lt;p&gt;The grid of lights appears to be represented as a two dimensional array, accessed &lt;code&gt;lights[x,y]&lt;/code&gt; rather than &lt;code&gt;lights[x][y]&lt;/code&gt;. In fact, the array &amp;#8212; like all arrays in awk &amp;#8212; is an associative container, which maps from strings &amp;#8212; not numbers &amp;#8212; to values.  The key &lt;code&gt;x,y&lt;/code&gt; becomes a string which joins the values of &lt;code&gt;x&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; with a separator defaulted to &lt;code&gt;&quot;\034&quot;&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&quot;niggles&quot;&gt;Niggles&lt;/h2&gt;
&lt;p&gt;The escape character at the end of line 5 is needed to continue the long line. I&amp;#8217;d prefer to use parentheses to wrap the expression over more than one line, &lt;a href=&quot;http://pep8.org/#indentation&quot;&gt;as I would in Python&lt;/a&gt;, but this trick doesn&amp;#8217;t seem to work. I was somewhat surprised there was no built in &lt;code&gt;sum()&lt;/code&gt; function to count up the number of lights turned on by the &lt;code&gt;END&lt;/code&gt;. It would have been cute to pass &lt;code&gt;on()&lt;/code&gt;, &lt;code&gt;off()&lt;/code&gt; and &lt;code&gt;toggle()&lt;/code&gt; as functions into &lt;code&gt;switch()&lt;/code&gt;, separating traversal from action, but I couldn&amp;#8217;t find a way to do this in awk.&lt;/p&gt;
&lt;p&gt;My awk script solved the puzzle in 45 seconds. A Python solution took 17 seconds. I didn&amp;#8217;t try optimising either.&lt;/p&gt;
&lt;script src=&quot;https://gist.github.com/wordaligned/007c5cb30a3f7490e3ff.js&quot; type=&quot;text/javascript&quot;&gt;&lt;/script&gt;

&lt;h2 id=&quot;dont-use-a-sledgehammer-to-crack-a-nut&quot;&gt;Don&amp;#8217;t use a sledgehammer to crack a nut!&lt;/h2&gt;
&lt;p&gt;This advice, commonly given to programmers, demands explanation. If it&amp;#8217;s intended to imply a sledgehammer is more likely to pulverise the nut than open it, then fine, that&amp;#8217;s true &amp;#8212; but the analogy fails in this case: a solution written in Python would have been equally correct.&lt;/p&gt;
&lt;p&gt;Alternatively, if we mean you shouldn&amp;#8217;t use a powerful language when a less powerful one would do, then the question becomes: &lt;strong&gt;why not&lt;/strong&gt;? Python is a general purpose programming language. It can crack nuts, peel bananas, serve web pages and so much more. If you know Python why bother with Awk?&lt;/p&gt;
&lt;p&gt;At the outset of this post I admitted I don&amp;#8217;t generally bother with awk. Sometimes, though, I encounter the language and need to read and possibly adapt an existing script. So that&amp;#8217;s one reason to bother. Another reason is that it&amp;#8217;s elegant and compact. Studying its operation and motivation may help us compose and factor our own programs &amp;#8212; programs far more substantial than the scripts presented here, and in which there will surely be places for mini-languages of our own.&lt;/p&gt;</description>
<dc:date>2016-02-23</dc:date>
<guid>https://wordaligned.org/articles/sledgehammers-vs-nut-crackers</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>https://wordaligned.org/articles/sledgehammers-vs-nut-crackers</link>
<category>Python</category>
</item>

</channel>
</rss>
