Programming and linguistics — Makers Week 1

For those who don’t already know (which is most people — I didn’t broadcast it widely), I have some personal tidings of a glad kind.

After investing several years and a postgraduate degree in becoming a professional classical musician — my first attempt to reboot myself after my linguistics undergrad — I am now officially, as of a week ago yesterday, not.

For the next 15 weeks (including four Pre-Course weeks at home, plus one for Christmas) I’m enrolled in Makers coding bootcamp: an intensive, full-time course which should help me to acquire the skills, technical, psychosocial and otherwise, to launch a new career as a software developer.

The change has been some time brewing. There are a lot of reasons for it: principally money, fun, and altruism (non-exclusive categories).

2018 has been an interesting year in the sense that I could stand at the January end of it without (on the one hand) very much clue what my life would look like by the December end, but feeling pretty certain (on the other hand) that I was going to have made sweeping changes. It’s been empowering actually to have followed through on that somewhat, and get as far as this starting block, having tried out a few other things along the way. I’m feeling curious and excited to see where programming takes me, not to mention exceedingly lucky to have the luxury of considering multiple career options.

What have I been up to so far? In Week the First: The En-Makers-ing, I’ve been getting the hang of Bash (a shell and scripting language for the command line, the direct way of giving a computer instructions), and the essentials of ‘version control’ with Git and GitHub. Then I got a start on Makers’ Ruby curriculum, which is consolidating the basic Ruby I did on Codecademy and CodeWars before the course. And I’m taking notes from a couple of books, The Well-Grounded Rubyist and Practical Object-Oriented Design in Ruby, to cover principles of good practice.

In Makers’ curriculum, something I’ve been particularly appreciating is the way that — wherever possible — the goal of learning has been backgrounded, and some other goal foregrounded (which matches up with Tip 8 from this review of studies on teaching programming). For example, after covering the basics of Bash and git, practice was instantiated as solving a murder mystery. The reason I like gamifying learning, or otherwise imbuing it with near-term goals, is that you can leave the mode of Trying To Study, which depends on constant application of willpower, and focus on something near-term so that progress on the overarching goal is implicit, unconscious, invisible and default. Nate Soares talks about this in this post which I highly recommend. One core analogy compares two strategies for getting fit: running up and down a field (you might slow down or give up) vs playing football (easy to keep going).

Taking the gamification of learning to its logical conclusion, I’ve started my coding journey just in time to leap aboard the Advent of Code train.

I’ve only just noticed what the acronym spells.

On one level it’s a series of 50 coding challenges, two released each day, for a variety of skill levels. On another, it’s a Christmas-themed time-travel narrative with, so far, well-written flavour text, embedded in a community of puzzler-learners who are happy to help each other out. A very motivating design, and much more satisfying to work through than your traditional chocolate advent calendar (incidentally something else I finished in week 1; thanks, Mum).

Linguistics I promised you, and linguistics shall you have. During this harrowing crisis of identity slash magical journey of self-discovery, I’ve wondered if there are any strands which tie together the artistic me who does music and the analytical me who enjoys making things with code. Some people reckon musical talent correlates with coding skill, but, while I agree that the Two Cultures have more in common than that which divides them, the explanatory analogies I’ve seen feel quite forced.

For me, the much stronger analogy is to the things that fascinated me in my linguistics undergrad. Several abstract concepts transfer so readily that programmers and linguists are using the very same vocabulary to refer to the very same things.

All humans, apart from aphasics, feral children, and the 45th President of the United States*, are natural geniuses at syntax. For all the archetypally human aspects of language use — its contextuality, winking subtexts, and emotional power — there are many elements of language that operate according to rules that can (in theory) be precisely specified. Such a system of rules, making the difference between French and Dyirbal (or between two dialects of the same language), is what linguists mean by ‘the grammar’.

*Actually, I was lying. Trump is like most humans in having more facility in the grammar of his native language than any currently existing Natural Language Processing program.

The vast majority of it goes on under the hood, totally hidden from conscious access, since the word ‘grammar’ has a much deeper meaning than that on about which English teachers, Telegraph letter writers, and people called Melvyn harp.

To linguists, ‘grammar’ means the stuff you mastered before you started pre-school — highly sophisticated mental programs for producing and interpreting your native language. It’s the reason native English-speakers automatically say ‘big brown paper bag’ and not ‘paper brown big bag’; and it’s why you say ‘in-bloody-credible’ and not ‘incre-bloody-dible’.

‘Grammar’ also happens to be etymological cousins with ‘glamour’, which used to mean ‘magic’. Read into that what you will. (Magic, I tell you.)

‘Syntax’ refers to the same kind of thing, whether instantiated in meat or in silicon minds. In both cases, we’re looking at rules that govern the surface representation of a deeper structure, and which describe the interface between the human and the program to be run / thought to be expressed.

Computer programs are typically written in source code, which is abstract enough for human programmers to work with. To run it, the computer must transform the source code into assembly code. If all goes to plan (if it doesn’t throw a syntax error) that in turn becomes machine code, bare 1s and 0s.
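For the curious, here is a minimal sketch of that translation step in Ruby itself. One caveat: the standard Ruby interpreter (MRI) compiles source code to bytecode for its own virtual machine rather than to assembly, and the RubyVM class used here is specific to MRI, so treat this as an illustration of the idea rather than of the exact pipeline described above.

```ruby
# MRI-specific: RubyVM is not available on other Ruby implementations.
source = "1 + 2"

# Translate the human-readable source into the VM's instruction sequence.
iseq = RubyVM::InstructionSequence.compile(source)

# A human-readable listing of the lower-level instructions.
listing = iseq.disasm

# Run the compiled instructions.
result = iseq.eval

puts listing
puts result
```

The listing is much longer than the one-line source, which is rather the point: the surface form is for humans, and the deeper representation is for the machine.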

A parallel process is at work every time your language faculty interprets an utterance. As you read this sentence, your mind is building a syntactic representation of its underlying hierarchical structure, grouping words into phrases, and phrases into parent phrases, in order to work out how they all relate to each other and what the sentence means.

(How he managed to get in my pajamas, I’ll never know.)

Your language-decoder doesn’t wait until the sentence ends to begin interpreting, otherwise conversations would be much less efficient. Instead, it generates hypotheses about the structure of the sentence in real time. In computer science terms, it’s more an interpreter than a compiler.

Of course, sometimes it gets it wrong: either the speaker did something weird, or your prediction was badger.

Neuroscientists call that mental jolt you just felt a P600, and I think it’s not too much of a stretch to liken this to the ‘syntax errors’ thrown by sub-optimal programs.
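Here is a rough sketch of the programming side of that comparison: asking Ruby whether a snippet is well-formed without running it. The method name parses? is my own invention, and RubyVM is specific to the standard (MRI) interpreter.

```ruby
# Hypothetical helper: true if Ruby can parse the source, false if it
# raises a SyntaxError. The source is compiled but never executed.
def parses?(source)
  RubyVM::InstructionSequence.compile(source)
  true
rescue SyntaxError
  false
end

well_formed = parses?("puts('the prediction was correct')")
malformed   = parses?("puts('the prediction was badger'")  # missing ')'

puts well_formed
puts malformed
```

Unlike your brain, Ruby makes no attempt to guess what you probably meant: the unclosed bracket is simply rejected.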

Look: if brains had command lines, maybe they’d be like this.

Sentences that lead you up the garden path, like ‘The old man the boat’, are a window into the way natural language is processed by humans. I think there’s something comparable at work in operator precedence, which is the way a programming language decides to group operations before evaluating them. The main difference here is that there will always be only one possible reading of a line of code.

So in any pizzeria run by Ruby, ordering the “pizza with tomato and cheese or vegan cheese and olives” will get you either a margherita, or a tomato-less vegan pizza, unless you specify that (cheese || vegan_cheese) should be evaluated first, simply because && has higher precedence than ||. Thank the Lord that, most of the time, humans have enough savvy and heuristics to get the intended precedence in natural language.
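To make the pizza order concrete, here is a small sketch. The variable names and the state of the pantry are made up for illustration; the only real claim is that Ruby groups && more tightly than ||.

```ruby
# Hypothetical pantry: we have run out of tomato.
tomato       = false
cheese       = true
vegan_cheese = true
olives       = true

# How Ruby reads the order as written: && binds tighter than ||.
as_written = tomato && cheese || vegan_cheese && olives

# The same grouping with the brackets spelled out.
rubys_reading = (tomato && cheese) || (vegan_cheese && olives)

# What the customer probably meant.
intended = tomato && (cheese || vegan_cheese) && olives

puts as_written     # true: a tomato-less vegan pizza can be made
puts rubys_reading  # true: identical to the line above
puts intended       # false: no tomato, no pizza
```

With no tomato in the kitchen, the two readings give different answers, which is exactly the ambiguity a human waiter would resolve without noticing.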

Enough syntax for now. For a bit of linguistics which does have uncancelable rule-ordering, let’s turn to Rule-Based Phonology. We can tell that the English pluralization rules ‘insert vowel’ (epenthesis) and ‘change plural morpheme from z to s if preceded by a voiceless segment’ (devoicing) are applied in that order, because we say ‘wishes’ with a voiced <s>: we go wish + z → wishəz → wishəz (rhyming with ‘militias’), not wish + z → wishs → wishəs (which would rhyme with ‘vicious’).

So we could say the English plural function calls the sub-operations in this order: stem.epenthesize.devoice
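As a toy sketch of that ordering, assuming rough phonemic spellings and a heavily simplified pair of rules (the method names, the sound lists, and the idea that the plural always starts life as ‘z’ are all simplifications for illustration):

```ruby
# Toy sound classes; real English phonology needs far more than this.
SIBILANTS = %w[s z ʃ ʒ].freeze
VOICELESS = %w[p t k f θ s ʃ].freeze

# Insert a vowel between a sibilant-final stem and the plural suffix.
def epenthesize(form)
  stem = form[0..-2]
  suffix = form[-1]
  SIBILANTS.include?(stem[-1]) ? "#{stem}ə#{suffix}" : form
end

# Turn the suffix 'z' into 's' straight after a voiceless sound.
def devoice(form)
  stem = form[0..-2]
  suffix = form[-1]
  suffix == "z" && VOICELESS.include?(stem[-1]) ? "#{stem}s" : form
end

# The order English appears to use: epenthesis first, then devoicing.
def pluralize(stem)
  devoice(epenthesize("#{stem}z"))
end

# The wrong order, for comparison.
def wrong_order_plural(stem)
  epenthesize(devoice("#{stem}z"))
end

puts pluralize("wɪʃ")           # wɪʃəz (rhymes with 'militias')
puts wrong_order_plural("wɪʃ")  # wɪʃəs (would rhyme with 'vicious')
puts pluralize("kæt")           # kæts
```

Swapping the two calls changes the output for ‘wish’ but not for ‘cat’, which is why sibilant-final words are the ones that give the ordering away.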

I’d estimate that there are hundreds of these kinds of morpho-/phonological and phonetic rules running in the average sentence, making an intriguingly intricate object of study.

And it’s quite amazing to consider that actual infants are toddling about their business, implicitly hypothesising, testing, and then applying rules as convoluted as “insert a voiceless stop homorganic with the place of the nasal into a syllable-final cluster composed of a nasal followed by a voiceless fricative” — all while mostly managing not to fall over. (That’s the stop-insertion rule which applies in words like ‘something’ and ‘prince’ to make them sound a bit like ‘sumpthing’ and ‘prints’.)

Both morphophonological and syntactic rules are often sensitive to the branching structure I mentioned above re elephant pajamas, which makes them behave in ways that are eerily analogous to various coding concepts.

One of these behaviours is recursion. It’s recursion that lets us build indefinitely long sentences out of a limited number of phrase types, since elements can be nested inside copies of each other or themselves. And our finite toolbox of affixes can give birth productively (or perhaps ‘extra-lexicographically’?) to Franken-words such as ‘proto-crypto-anti-re-nationalizationalistificationism’, or ‘extra-lexicographically’, or ‘Franken-words’, which, in theory anyway, have comprehensible meanings.

But not all recursion is created equal. You probably understand phrases like ‘Alexa’s mum’s friend’s sister’, or ‘piles of boxes of packets of crisps’, more easily than ‘The hole that the rat that the cat that the dog barked at chased escaped into was dingy,’ or ‘If the, if I may, if you like, put it this way, price is right, then we have a deal’, even though in all cases the recursion is only three levels deep. If you were wondering why that is, it’s because syntactic center-embedding is more computationally costly than left- or right-embedding. (According to one theory, anyway. I ought to say that none of the linguistic theories I’ve mentioned are non-contentious.)
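The easy kinds of embedding can be sketched as a pair of recursive Ruby methods. The method names are invented for this example, and real phrase structure is of course much richer than string-gluing.

```ruby
# Right-embedding: each phrase contains a smaller phrase of the same kind.
def nest(containers, contents)
  return contents if containers.empty?

  first, *rest = containers
  "#{first} of #{nest(rest, contents)}"
end

# Left-embedding: the possessor is itself a (smaller) possessive phrase.
def possessive(owners, noun)
  return noun if owners.empty?

  *earlier, last = owners
  "#{possessive(earlier, last)}'s #{noun}"
end

puts nest(%w[piles boxes packets], "crisps")
# piles of boxes of packets of crisps
puts possessive(%w[Alexa mum friend], "sister")
# Alexa's mum's friend's sister
```

Each method handles one layer and hands the rest of the job to a smaller copy of itself, so a finite rule can build a phrase of any depth.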

People who believe in the operation ‘Merge’ (like Chomsky) believe language is very deeply recursive, as it uses Merge all over the place to build one thing out of multiple constituents: sentences contain phrases contain words contain morphemes contain syllables contain rimes contain codas contain phonemes contain phones contain features. (Others disagree.)

Recursion in programming is, similarly, whenever a function calls itself. A simple example would be a factorial, like n.factorial = n * (n-1).factorial, or an array of arrays.
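Spelled out as runnable Ruby, a minimal sketch might look like this. I’ve written plain methods rather than adding a factorial method to integers, and the depth method is my own addition to show recursion over an array of arrays.

```ruby
# n! = n * (n - 1)!, with 0! = 1 as the case that stops the recursion.
def factorial(n)
  raise ArgumentError, "n must be a non-negative integer" unless n.is_a?(Integer) && n >= 0
  return 1 if n.zero?

  n * factorial(n - 1)
end

# How deeply are arrays nested inside this item? A non-array has depth 0.
def depth(item)
  return 0 unless item.is_a?(Array)

  1 + (item.map { |child| depth(child) }.max || 0)
end

puts factorial(5)            # 120
puts depth([1, [2, [3]]])    # 3
```

Both methods need a base case (zero, or a non-array) where the self-calling stops; without it, the turtles really would go all the way down.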

It’s turtles all the way down.

A couple of other patterns relevant to both kinds of branching structures (those in natural and programming languages) are:

  • Inheritance. A ‘child’ object tends to inherit properties from its ‘parent’ class, e.g. all string objects know how to concatenate. In English, quoted speech inherits tense from the main clause, but it didn’t have to be that way; Ancient Greek for example just reports exactly the tense that would have been said: ‘Yesterday Christina complained that she is hungry.’
  • Scope. Programming languages have rules about where a variable is ‘local’ to, affecting what parts of the program will be able to parse it. In the English sentence ‘Bill thinks Jill should pat himself on the back’, ‘himself’ can’t be parsed, even though ‘Bill’ is right there at the start of the sentence; ‘himself’ is in a lower level of the sentence than ‘Bill’ and has local scope, so it can’t see further left than ‘Jill’. This kind of thing gets complicated fast. As another example, ‘she’ can refer to ‘Olu’ in ‘As she says, Olu can answer’ but not in ‘She says Olu can answer’, pivoting on whether ‘says’ is taking ‘Olu can answer’ as an argument.
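Both patterns can be sketched in a few lines of Ruby. The class and method names are invented for illustration, and this shows only one of Ruby’s scoping rules: a method body cannot see local variables defined outside it.

```ruby
# Inheritance: VeganPizza gets describe from Pizza without redefining it.
class Pizza
  def toppings
    %w[tomato cheese]
  end

  def describe
    "pizza with #{toppings.join(' and ')}"
  end
end

class VeganPizza < Pizza
  # Override just the part that differs from the parent.
  def toppings
    ["tomato", "vegan cheese"]
  end
end

# Scope: 'order' is local to the top level of the script...
order = "olives"

# ...so inside the method, Ruby cannot see it at all.
def can_see_order?
  !defined?(order).nil?
end

puts VeganPizza.new.describe  # pizza with tomato and vegan cheese
puts can_see_order?           # false
```

Like ‘himself’ failing to reach ‘Bill’, the method body has its own local domain, even though the variable is sitting right there a few lines above it.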

I hope I’ve begun to persuade you that linguists and programmers should be friends. I’m sure that I’ll come across a lot more commonalities in weeks to come. Maybe even similarities to music. I’ll try and keep the length of the posts down in future too, and stay closer to the topic of learning programming! Speaking of which, I wonder if Day 4 of the Advent of Code is live yet…

This is my programming blog. www.github.com/david-mears