Algorithms To Live By Review
This is my review of Algorithms To Live By, written by Brian Christian and Tom Griffiths. The book aims to highlight, through fun and pertinent examples, a set of classic computer science problems and the algorithms for solving them. It is decidedly pop-science, so don’t expect any heavy math - or any math at all. But it was a fun and easy read, suitable for a vacation, and the historical framing of the problems is great no matter your technical level.
The first chapter deals with optimal stopping problems, the so-called secretary hiring problems. The setting is that you are presented with a series of options in sequence and must choose the best one, usually without being able to go back to an option you passed on. A bunch of real-life situations are posed: shopping for a house, finding a partner, hiring someone or deciding on a parking spot. Each adds a twist to the basic setup, and for each a simple solution is provided. For example, in the standard variation, the best strategy is to look at the first 37% of the options without committing, then pick the first one that beats everything seen so far; this maximises your chance of ending up with the very best option, which itself turns out to be about 37%. Counterintuitive, but it’s backed by math.
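To see the 37% rule in action, here’s a minimal simulation sketch (function names and parameters are my own, not from the book): skip the first 37% of candidates, then commit to the first one better than everything seen so far, and measure how often that is the overall best.

```python
import random

def best_chance(n=100, trials=100_000, look_fraction=0.37):
    """Estimate how often the 37% rule picks the single best candidate."""
    cutoff = int(n * look_fraction)
    wins = 0
    for _ in range(trials):
        ranks = list(range(n))           # 0 is best, n-1 is worst
        random.shuffle(ranks)
        benchmark = min(ranks[:cutoff])  # best seen during the look phase
        # Commit to the first later candidate that beats the benchmark;
        # if none does, we're stuck with the last one.
        chosen = ranks[-1]
        for r in ranks[cutoff:]:
            if r < benchmark:
                chosen = r
                break
        wins += (chosen == 0)
    return wins / trials

print(best_chance())  # ≈ 0.37 for large n
```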
The second chapter deals with multi-armed bandit problems. These are also choice problems, like the optimal stopping ones, only we make many choices - usually many more than there are options. Each choice yields a reward, but the outcome is probabilistic; furthermore, the reward distribution is not known to us and must be learned through the choices we make. So we face an explore-exploit dilemma at every point: do we go for options we know give good outcomes, or for untried options that might give even better ones? We find the same issue in life as well - when deciding between a restaurant we know is good and one we haven’t tried, for example. In industry the examples are even more abundant: most product companies A/B test their products, and A/B testing is a form of multi-armed bandit with the constraint that we sample the two branches equally. Indeed, MABs are a generalisation of statistical testing. The two algorithms covered here are the Gittins index and upper confidence bound ones, and both have a bias towards exploration - something psychologists have also noticed in humans.
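As a concrete sketch of the upper-confidence-bound flavour, here’s a minimal UCB1 implementation (my own toy version, assuming Bernoulli rewards - none of this is the book’s code): each arm’s score is its observed mean plus a bonus that shrinks the more it’s tried, so neglected arms keep getting chances.

```python
import math
import random

def ucb1(true_means, rounds=10_000):
    """Play a Bernoulli bandit with the UCB1 rule; return reward and pulls."""
    n = len(true_means)
    counts = [0] * n   # times each arm was pulled
    sums = [0.0] * n   # total reward per arm
    total = 0.0
    for t in range(1, rounds + 1):
        if t <= n:
            arm = t - 1  # pull every arm once to initialise
        else:
            # mean + exploration bonus: rarely-tried arms get a big bonus
            arm = max(range(n), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if random.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total, counts

reward, pulls = ucb1([0.2, 0.5, 0.7])
print(pulls)  # the 0.7 arm should dominate the pull counts
```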
The third chapter deals with the problem of sorting. It actually manages to present Big-O notation in pretty good form for a layman, and then introduces bubble sort, insertion sort and merge sort in quick succession. The second half of the chapter is dedicated to sports rankings and how most structures for organising tournaments - single elimination, brackets etc. - aren’t enough to capture the full ordering of teams, but can only reliably pick the best one. And even then, in many sports the weaker team has decent odds of beating the stronger one. The idea of sorting with a randomness component in the comparator function was also presented, which I found pretty neat.
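Of the three sorts, merge sort is the one worth internalising; here’s a compact recursive sketch (my own minimal version, not taken from the book):

```python
def merge_sort(xs):
    """Sort xs in O(n log n) by recursively splitting and merging."""
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
    merged, i, j = [], 0, 0
    # Merge the two sorted halves, always taking the smaller head.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```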
The fourth chapter continues with the standard computer science subject of caching. It covers the memory hierarchies of modern computers and the tradeoffs that lead to them. It then links them to library organisation - which is quite fitting, because in both cases we’re dealing with how space and its structure affect how we build systems. The final part of the chapter focuses on human memory and how things like cognitive decline can be interpreted as information retrieval without good caches - or at least with caches tuned to a different world than our current one.
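The eviction policy the chapter champions is Least Recently Used. Here’s a minimal LRU sketch built on an ordered dictionary (my own illustration, not code from the book):

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the oldest entry

cache = LRUCache(2)
cache.put("a", 1); cache.put("b", 2); cache.get("a"); cache.put("c", 3)
print(list(cache.items))  # ['a', 'c'] - 'b' was least recently used
```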
The fifth chapter deals with scheduling, and particularly scheduling on a single machine, to tie it closely to the problems humans face with their own calendars and commitments. The biggest takeaway here was how OS design around interruptions and preemption lines up with the standard productivity advice - long stretches of focus on one topic, trading responsiveness for throughput.
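One of the chapter’s classic single-machine results is that sorting tasks by Shortest Processing Time minimises the average completion time. A toy sketch (the task names and durations are my own invention):

```python
# Two lists of (name, duration) tasks; SPT is literally just a sort.
tasks = [("email", 1), ("report", 4), ("review", 2)]

# Shortest Processing Time first minimises average completion time.
spt = sorted(tasks, key=lambda t: t[1])

done, total = 0, 0
for name, dur in spt:
    done += dur        # completion time of this task
    total += done
print(spt, "avg completion:", total / len(tasks))  # 3.67 vs worse orders
```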
The sixth chapter was about Bayes’ Rule and it was a lot of fun. Highlights were the presentation of three common types of distribution - normal, power-law and Erlang - and the prediction rule each implies under Bayes’ Rule: for the normal, predict roughly the average; for the power-law, multiply what you’ve observed so far by a constant; for the Erlang, predict a constant amount more. It was neat how this was all tied to human predictions and how we intuitively learn and use these rules - how we make good predictions for things we have a good model of and bad ones for things we don’t, and how the modern world’s mechanical distribution of news and information about events skews our models!
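To make the three rules concrete, here’s a toy restatement in code (the numbers, the scale of 15 and the mean of 40 are all my own invented illustration - the book derives the rules, I’m just paraphrasing them):

```python
# Predicting a total lifespan, given something has lasted t so far.
t = 10  # e.g. a business has been running for 10 years

# Power-law prior -> Multiplicative Rule: predict a constant multiple
# of the current age (factor 2 in the uninformative Copernican case).
power_law_prediction = 2 * t

# Erlang prior -> Additive Rule: predict a constant amount more,
# regardless of age (hypothetical characteristic scale of 15 here).
erlang_prediction = t + 15

# Normal prior -> Average Rule: predict roughly the distribution's
# mean (say 40) until the observation pushes past it.
average_prediction = max(40, t)

print(power_law_prediction, erlang_prediction, average_prediction)
```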
The seventh chapter is about overfitting. This is a fairly technical topic in statistics and ML, but the authors managed to present it quite decently as a lens on decision making under uncertainty. In the process they covered techniques such as cross-validation, regularisation and early stopping.
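Here’s a minimal holdout-validation sketch with toy data (all of it my own illustration): a degree-9 polynomial fits the training noise better than a straight line but does worse on held-out points.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 40)
y = 2 * x + rng.normal(0, 0.2, size=x.size)  # truly linear + noise

idx = rng.permutation(x.size)
train, test = idx[:30], idx[30:]             # simple holdout split

for degree in (1, 9):
    coeffs = np.polyfit(x[train], y[train], degree)
    pred = np.polyval(coeffs, x[test])
    mse = np.mean((pred - y[test]) ** 2)
    print(f"degree {degree}: held-out MSE = {mse:.3f}")
# The degree-9 fit hugs the training noise and loses on held-out data.
```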
The eighth chapter deals with optimization problems and how to solve them approximately via relaxation. It covers a bunch of practical applications for TSPs and assignment problems and then delves into constraint relaxation and continuous relaxation. Lagrangian relaxation - where you take hard constraints and turn them into penalty terms in the cost function - was also mentioned; it was the first time I’d heard of it, and it seemed pretty neat.
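A sketch of the constraint-into-cost idea, in the spirit of Lagrangian relaxation (the toy problem and penalty weights are my own; a full Lagrangian relaxation would also optimise the multiplier rather than fix it):

```python
# Minimise (x - 3)^2 subject to x <= 1, by moving the constraint into
# the objective as a weighted violation cost and brute-forcing x.

def relaxed_cost(x, lam):
    violation = max(0.0, x - 1.0)        # how badly "x <= 1" is broken
    return (x - 3.0) ** 2 + lam * violation

xs = [i / 100 for i in range(-200, 500)]
for lam in (0.0, 1.0, 10.0):
    best = min(xs, key=lambda x: relaxed_cost(x, lam))
    print(f"lambda={lam}: best x = {best:.2f}")
# lambda = 0 ignores the constraint (x -> 3); as lambda grows,
# violating it becomes expensive and the solution is pushed to x = 1.
```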
The ninth chapter deals with randomness and how it can help with intractable problems. It starts off with a presentation of Monte Carlo methods and estimating \(\pi\). It then covers Bloom filters (which arguably aren’t random, but behave as such for many purposes). The largest part is devoted to optimization methods for hard problems and how randomness can help there. Simulated annealing is the focus, and the whole domain is then likened to creativity and “thinking outside the box”.
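The \(\pi\) estimate is worth seeing in code - a standard Monte Carlo sketch (mine, not the book’s): sample random points in the unit square and count how many fall inside the quarter circle.

```python
import random

def estimate_pi(n=1_000_000):
    """The fraction of random points in the unit square that land inside
    the quarter circle is pi/4, so multiply by 4 to estimate pi."""
    inside = sum(1 for _ in range(n)
                 if random.random() ** 2 + random.random() ** 2 <= 1.0)
    return 4 * inside / n

print(estimate_pi())  # ≈ 3.1416, with noise shrinking as n grows
```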
The tenth chapter is about networks. It was perhaps the one most distant from real-life applications, mostly an overview of various technologies and protocols, with a final note on bufferbloat.
The eleventh and final chapter is about game theory. Whereas the previous one took us far away from human affairs, this one drops us straight into the gutter, as it were. The interesting thing here is that there’s no need for analogy or for translating computer algorithms to human problems, because game theory is already about this! A nice result for me was the price of anarchy, which shows that for certain problems (or games) actors acting rationally and selfishly in a local way might not be that far off from the global optimum for the system. This applies to city traffic, for example, where the worst-case gap between selfish and coordinated routing is only about a third - so self-driving, coordinated cars might improve traffic by just 30% or so, definitely not the massive improvement I was expecting. Another interesting bit of insight was about mechanism design, and how some games are just flawed if we expect participants to act rationally. Things like vacation policies at work, opening hours for competing stores and the way redwood forests grow are all examples of this at play. The solution here is to have an external enforcer of constraints - something which changes the rules of the game so that this sort of behaviour doesn’t occur. Think governments, regulatory bodies etc.
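The classic worked example behind that traffic bound is Pigou’s two-road network, where the price of anarchy comes out to exactly 4/3. A quick numeric sketch (my own, not from the book):

```python
# Pigou's two-road example: one unit of traffic travels from A to B.
# Road 1 always takes 1 hour; road 2 takes x hours when a fraction x
# of the traffic uses it.

def total_cost(x):  # x = fraction of traffic on road 2
    return (1 - x) * 1.0 + x * x

# Selfish equilibrium: road 2 is never slower than road 1 (x <= 1),
# so everyone takes it: x = 1, total cost 1.0.
selfish = total_cost(1.0)

# Social optimum: minimise total cost over x (derivative -1 + 2x = 0).
optimal = total_cost(0.5)  # = 0.75

print(selfish / optimal)  # 4/3 -> selfish routing is at most ~33% worse
```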
Finally, the coda of the book is about computational kindness - realising that a lot of the problems we face in interacting with one another are computational, and that we can go a long way towards helping each other by making those problems easy to solve: offering a constrained choice, for example, so the other person only has to say yes or no rather than weigh a large number of options.
Overall this was a good and useful read, and one I’d recommend to fellow engineers, computer scientists, data scientists etc. I was already familiar with most of the problems and algorithms, but the book provided a great deal of context and setting for them, and made them more human and useful rather than simply tools to reach for next time I build an A/B testing system. The historical material was also very good, helping again with the context and motivation of the problems!