A worked example of how to write an algorithm
In this short post I’ll show how I think about writing algorithms, which is comp-sci-speak for sequences of steps targeted at a task. I’ll use this example of an initially overwhelming task:
As head librarian, so that students can browse more efficiently, I want to arrange the library so that books on similar topics are near to each other. I need to automate the process.
1. Consider a simple case of the problem and solve it yourself, either mentally or on paper.
For example, a simple case of sorting a library is sorting my bedroom bookpile at home onto my shelf.
2. Watch how your own brain goes about solving the problem. You may notice that there are a few competing ways of doing it; choose just one for the moment.
One way to sort a bookpile by topic is to list a few categories of things I am interested in, and then pass through all the books putting them in the most relevant category. Then if the categories seem too uneven I can apply the same process to the largest categories recursively.
3. Write down the steps that your brain used.
A) Think of a number of topics by looking at the books that I have, e.g. philosophy, linguistics, music.
B) For each book not yet sorted, compare it with each category to find the most relevant category, then put the book in that category. If the most relevant category is too irrelevant, put the book in a Miscellaneous category.
C) Eyeball the number of books in each category.
D) If a category seems too big compared to the average category size, or there are too few categories, apply steps A-D to the largest category.
E) Apply steps A-D to the Miscellaneous category until either you are bumping up against your threshold for how unrelated books in a single category may be to each other, or you are satisfied with the number of categories or of Miscellaneous books.
4. Attempt to directly translate each written step into code or pseudocode. If the step can’t be so translated, break it down further by re-applying 2–4 on that written step. Because of the recursion, prefer a breadth-first translation process over a depth-first translation — you might notice gaps earlier with a breadth-first pass if a depth-first pass goes very deep.
For example, in the library example, step A (think of some topics) seems quite vague, and if you aren’t prepared to do some natural language processing it might make sense to immediately skip to 5 and find an alternative strategy. If you were willing to invest time in some NLP, then the first recursive translation of step A might give you an algorithm such as: find the most common nouns or noun phrases in book titles/descriptions, then select the top N noun(-phrase)s where N is the number of books in the library divided by (the number of books divided by the optimal category size).
5. If the resulting algorithm is likely to be efficient enough for your purposes, stop and have a cup of tea. If efficiency might become a concern at some point return to step 1 or 2 and find an alternative algorithm.
Having completed step 4 you will now have either functioning code or something very close.
If you know about the Dewey decimal system for organizing books, which is a hierarchy of topics for the purpose of topic-sorting that is now somewhat dated, you might decide to accept the trade-off between optimizing students’ facility of browsing by topic and optimizing your time spent reinventing wheels.
The key skill here is in noticing the sub-tasks which higher-level tasks consist of. This comes with practice of interacting with beings that have no commonsense ability to know what you mean, e.g. computers, and overly literal dads (Video: Exact Instructions Challenge — THIS is why my kids want to kill me):