# Quick introduction: Problems and algorithms

January 26, 2019

For this week, I want to try a new type of post: a quick introduction to a standard topic that might not be familiar to all readers and that could be useful later on. The goal is to write a shorter post than usual and provide a launching point for future, more detailed discussions of a topic. Let's see if I can stick to 500 words in the future (this post, alas, runs to 933).

For our first topic, let’s turn to theoretical computer science.

There are many ways to subdivide theoretical computer science, but one of my favorite divisions is into the two battling factions of computational complexity and algorithm design. To sketch a caricature: the former focus on computational problems and lower bounds, while the latter focus on algorithms and upper bounds. The latter have counterparts throughout science, but I think the former are much less frequently encountered outside theoretical computer science. I want to sketch the division between these two fields. In the future I'll explain how it can be useful for reasoning about evolutionary biology.

So let’s start with some definitions, or at least intuitions.

A computational problem is a formal mathematical object composed of a countably infinite set of finite problem instances. In the simplest case, each problem instance has some particular answer. For many purposes, we can focus on decision problems with 'yes'/'no' answers. In that case, we call the set of problem instances that map to 'yes' the formal language corresponding to the problem.^{[1]} In fact, for decision problems, we use problem and language interchangeably. Since each problem instance is finite, we could make a bijection between them and the natural numbers, and talk of the 1st, 2nd, 3rd, and so on problem instance. Without loss of generality, we can just consider our set of problem instances as the set {0,1}* of finite (but arbitrarily long) binary strings. Then any subset of {0,1}* is a formal language that corresponds to some problem.
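To make this concrete, here is a minimal sketch in Python (the function name and encoding are my own illustrative choices, not standard): a decision problem is just a membership test on binary strings, and the 'yes'-instances form the language.

```python
# A decision problem as a formal language: the subset of binary strings
# that encode even natural numbers (a deliberately simple example).

def is_even_language(instance: str) -> bool:
    """'Yes'-instances: binary strings whose numeric value is even."""
    if instance == "" or any(c not in "01" for c in instance):
        return False  # malformed strings are 'no'-instances
    return instance.endswith("0")
```

Any such predicate carves {0,1}* into 'yes'- and 'no'-instances, and that partition is the problem.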

An algorithm *A* is some specific finite (although, again, arbitrarily long) description of how to take an arbitrary problem instance and turn it into an answer.^{[2]} For algorithms that only return 'yes' or 'no', we can again talk about the language corresponding to the algorithm: the set L(A) of all problem instances that the algorithm turns into 'yes'. We say that a (terminating) algorithm *A* solves a decision problem *L* if L(A) = L.
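A toy sketch of what "A solves L" means (the names here are illustrative): the instances that A accepts coincide with L. Of course, in code we can only spot-check agreement on finitely many instances, since the language itself is infinite.

```python
# A decision problem L: binary strings with an odd number of 1s.
def L(x: str) -> bool:
    return x.count("1") % 2 == 1

# A candidate algorithm A: track parity of 1s bit by bit.
def A(x: str) -> bool:
    parity = False
    for bit in x:
        if bit == "1":
            parity = not parity
    return parity

# A solves L when L(A) = L; here we spot-check on a finite sample.
sample = ["", "1", "10", "0110", "111"]
assert all(A(x) == L(x) for x in sample)
```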

Now, keen readers will hopefully notice something strange. Since *A* has to be finite, there are only countably many different *A*s and thus at most countably many different L(A). However, problems correspond to arbitrary subsets of {0,1}*: so, there are uncountably many problems. Thus, there are some problems that have no algorithm that solves them. In fact, by this argument, almost all problems are not solvable — just like almost all real numbers are not rational.

This is an existence argument, but we can make it constructive by providing a particular problem — like the famous Halting problem — that is undecidable, i.e. that we can show as not solvable. And since we just connected to the difference between the infinitude of natural numbers versus reals, it is tempting to turn to Cantor. The typical proof of the undecidability of the Halting problem works in much the same way as Cantor's diagonal argument for why the infinity of real numbers is strictly greater than the infinity of natural or rational numbers. Whereas Cantor looks at the diagonal of his mapping and increases each number by 1 (looping around) to create a number not on the list, Turing can look for a halt and 'increase by 1' to an infinite loop.
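As a toy illustration (not a full proof), here is the diagonal trick in Python against one particular candidate decider: whatever `halts` answers about `paradox` fed its own code, `paradox` does the opposite. The full argument runs the same refutation against every candidate.

```python
def halts(program, x):
    # A candidate (and necessarily wrong) decider for the Halting problem.
    # Any candidate must output True or False; this one always says False.
    return False

def paradox(program):
    # Turing's diagonal move: do the opposite of what `halts` predicts.
    if halts(program, program):
        while True:       # predicted to halt, so loop forever
            pass
    return "halted"       # predicted to loop, so halt immediately

# This candidate claims paradox(paradox) runs forever -- yet it halts:
assert paradox(paradox) == "halted"
assert halts(paradox, paradox) is False  # the decider was wrong about paradox
```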

I’ll let you read the details on Wikipedia and save them for a later quick introduction.

The important point is that this lets us know that some problems are extremely hard: there is no algorithm to solve them.

One of the goals of computational complexity is to further refine our notions of hardness. If we limit the resources available to our algorithms further (for example, by asking them to run quickly or not to use too much work space), how does the set of languages decidable under these restrictions change? The goal of algorithm design, in contrast, is to show that we can solve certain problems quickly and efficiently.

I hope that this is a useful quick intro, dear reader. If you’d like more of these — and want to suggest any particular topics — then please leave your comments below.

### Notes

- It is usually not too difficult to generalize from yes-no questions to more general problems that have a larger domain of answers. Often, you can just use a series of yes-no questions to find answers to non-yes/no problems. For example, for numeric questions, you might just do binary search by repeatedly asking “is the answer to problem-instance x greater than y?”. As such, theoretical computer scientists tend to stick to just yes-no questions. A notable exception is on problems that are guaranteed to have ‘yes’ answers. But I’ll save that for another time.
- For simplicity, we’ll assume that our algorithm is deterministic; i.e. it produces the same answer when seeing the same problem instance. But it is important to note that in practice this is not a big assumption: if you want to consider a randomized algorithm running on input x, just take every coin flip the algorithm uses and attach it as a trailing input y: i.e. do all the coin flips the algorithm would need ahead of time. Then just run your (now deterministic) algorithm on (x,y).
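The binary-search reduction from note [1] can be sketched like this (the helper name and search range are my own illustrative choices):

```python
# Recovering a numeric answer from yes/no queries, as in note [1]:
# binary search using only the decision oracle "is the answer > y?".

def numeric_answer_via_decisions(greater_than, lo=0, hi=10**6):
    """Find the integer answer in [lo, hi] with O(log(hi - lo)) yes/no queries."""
    while lo < hi:
        mid = (lo + hi) // 2
        if greater_than(mid):   # "is the answer greater than mid?"
            lo = mid + 1
        else:
            hi = mid
    return lo

secret = 1234
assert numeric_answer_via_decisions(lambda y: secret > y) == 1234
```

About 20 yes/no questions suffice for a million candidate answers, which is why decision problems lose so little generality.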
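Note [2]'s determinization trick can be sketched with a toy algorithm (names mine): all the coin flips are packaged into a second input y, so the same pair (x, y) always yields the same answer.

```python
# A toy 'randomized' algorithm made deterministic: y[i] plays the role
# of the i-th coin flip the algorithm would otherwise have tossed.

def deterministic_version(x: str, y: str) -> bool:
    return any(bx == "1" and coin == "1" for bx, coin in zip(x, y))

# Same input pair, same answer -- every time:
assert deterministic_version("101", "110") == deterministic_version("101", "110")
assert deterministic_version("101", "010") is False
```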
