
Games, culture, and the Turing test (Part I)

March 7, 2013 by Artem Kaznatcheev 13 Comments

Intelligence is one of the most loaded terms that I encounter. A common association is the popular psychometric definition: IQ. For many psychologists, this definition is too restrictive and the g factor is preferred for getting at the ‘core’ of intelligence tests. Even geneticists have latched on to g for looking at the heritability of intelligence, inadvertently helping us see that g might be too general a measure. Still, for some, these tests are not general enough since they miss the emotional aspects of being human, and tests of emotional intelligence have been developed. Unfortunately, the bar for intelligence is a moving one, whether through the Flynn effect in IQ or, more commonly, through constant redefinitions of ‘intelligence’.

Does being good at memorizing make one intelligent? Maybe in the 1800s, but not when my laptop can load Google. Does being good at chess make one intelligent? Maybe before Deep Blue beat Kasparov, but not when my laptop can run a chess program that beats grandmasters. Does being good at Jeopardy make one intelligent? Maybe before IBM Watson easily defeated Jennings and Rutter. The common trend here seems to be that as soon as computers outperform humans on a given act, that act and its associated skills are no longer considered central to intelligence. As such, if you believe that talking about an intelligent machine is reasonable, then you want to agree on an operational benchmark of intelligence that won’t change as you develop your artificial intelligence. Alan Turing did exactly this and launched the field of AI.

I’ve stressed Turing’s greatest achievement as assembling an algorithmic lens and turning it on the world around him, and previously highlighted its application to biology. In popular culture, he is probably best known for applying the algorithmic lens to the mind: the Turing test (Turing, 1950). The test has three participants: a judge, a human, and a machine. The judge uses an instant messaging program to chat with the human and the machine, without knowing which is which. At the end of a discussion (which can be about anything the judge desires), she has to determine which is man and which is machine. If judges cannot correctly identify the machine more than 50% of the time, then it is said to pass the test. For Turing, this meant that the machine could “think”, and for many AI researchers this is equated with intelligence.
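One way to make the 50% criterion concrete is to ask whether the judges' hit rate is statistically distinguishable from coin flipping. The following is only a minimal sketch in Python, not something from Turing's paper: the function names and the 5% significance threshold are assumptions made for illustration, and each verdict is recorded as True when the judge correctly picked out the machine.

```python
from math import comb

def judge_accuracy(verdicts):
    """Fraction of trials in which the judge correctly identified the machine."""
    return sum(verdicts) / len(verdicts)

def p_value_above_chance(correct, trials):
    """One-sided exact binomial p-value: P(X >= correct) for X ~ Binomial(trials, 1/2)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def machine_passes(verdicts, alpha=0.05):
    """Assumed pass rule: the judges' accuracy is not significantly above chance.

    The alpha threshold is an illustrative assumption, not part of the test itself.
    """
    correct = sum(verdicts)
    return p_value_above_chance(correct, len(verdicts)) >= alpha

# A judge who is right in half of ten conversations is indistinguishable from guessing.
print(machine_passes([True, False] * 5))
# A judge who is right in all ten conversations has clearly spotted the machine.
print(machine_passes([True] * 10))
```

With only a handful of conversations almost any machine would "pass" under this rule, so the number of trials matters as much as the threshold.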

You might have noticed a certain arbitrariness in the chosen mode of communication between judge and candidates. Text-based chat seems to be a very general mode, but is general always better? Instead, we could just as easily define a psychometric Turing test by restricting the judge to only give IQ tests. Strannegård and co-authors did this by designing a program that could be tested on the mathematical sequences part of IQ tests (Strannegård, Amirghasemi, & Ulfsbäcker, 2013) and Raven’s progressive matrices (Strannegård, Cirillo, & Ström, 2012). The authors’ anthropomorphic method could match humans on either task (IQ of 100) and, on the mathematical sequences, greatly outperform most humans if desired (IQ of 140+). In other words, a machine can pass the psychometric Turing test, and if IQ is a valid measure of intelligence then your laptop is probably smarter than you.

Of course, there is no reason to stop restricting our mode of communication. A natural continuation is to switch to the domain of game theory. The judge sets a two-player game for the human and computer to play. To decide which player is human, the judge only has access to the history of actions the players chose. This is the economic Turing test suggested by Boris Bukh and shared by Ariel Procaccia. The test can be viewed as part of the program of linking intelligence and rationality.
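The protocol described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration rather than part of Bukh's or Procaccia's proposal: the Prisoner's Dilemma payoffs, the naive "rational" machine that maximizes its average payoff, the coin-flipping stand-in for a human, and the judge's rule of thumb. The only feature taken from the text is that the judge sees nothing but the history of actions.

```python
import random

# payoffs[(a0, a1)] = (payoff to seat 0, payoff to seat 1).
# Illustrative Prisoner's Dilemma set by the judge; the numbers are assumptions.
PD = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def expected_payoff(payoffs, seat, action):
    """Average payoff to `seat` for `action` against a uniformly random opponent."""
    pays = [pay[seat] for profile, pay in payoffs.items() if profile[seat] == action]
    return sum(pays) / len(pays)

def rational_player(payoffs, seat, rng):
    """Naive machine: always picks the action with the highest average payoff."""
    actions = sorted({profile[seat] for profile in payoffs})
    return max(actions, key=lambda a: expected_payoff(payoffs, seat, a))

def humanlike_player(payoffs, seat, rng):
    """Crude stand-in for a human: cooperates half of the time (pure assumption)."""
    return "C" if rng.random() < 0.5 else "D"

def play(payoffs, players, rounds, rng):
    """Return the action history, which is all the judge gets to see."""
    return [tuple(p(payoffs, seat, rng) for seat, p in enumerate(players))
            for _ in range(rounds)]

def judge_guess_machine(payoffs, history):
    """Assumed heuristic: the machine is the seat that played 'rationally' more often."""
    def score(seat):
        best = rational_player(payoffs, seat, None)
        return sum(1 for profile in history if profile[seat] == best)
    return 0 if score(0) >= score(1) else 1

history = play(PD, [humanlike_player, rational_player], 20, random.Random(0))
print(judge_guess_machine(PD, history))  # index of the seat the judge labels "machine"
```

A programmer who wants to fool this judge would have to build in human-like deviations from payoff maximization.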

Procaccia raises the good point that in this game it is not clear if it is more difficult to program the computer or to be the judge. Before the work of Tversky & Kahneman (1974), a judge would not even know how to distinguish a human from a rational player. Forty years later, I still don’t know of a reliable survey or meta-analysis of well-controlled experiments of human behavior in the restricted case of one-shot perfect information games. But we do know that judge-designed payoffs are not the only source of variation in human strategies, and I even suggest the subjective-rationality framework as a way to use evolutionary game theory to study these deviations from objective rationality. Understanding these departures is far from a settled question for psychologists and behavioral economists. In many ways, the programmer in the economic Turing test is a job description for a researcher in computational behavioral economics and the judge is an experimental psychologist. Both tasks are incredibly difficult.

For me, the key limitation of the economic (and similarly, standard) Turing test is not the difficulty of judging. The fundamental flaw is the assumption that game behavior is a human universal. Much like the unreasonable assumption of objective rationality, we cannot simply assume uniformity in the heuristics and biases that shape human decision making. Before we take anything as general or universal, we have to show its consistency not only across the participants we chose, but also across different demographics and cultures. Unfortunately, much of game behavior (for instance, the irrational concept of fairness) is not consistent across cultures, even if it is largely consistent within a single culture. What a typical Western university student considers a reasonable offer in the ultimatum game is not typical for a member of the Hadza group of Tanzania or the Lamalera of Indonesia (Henrich et al., 2001). Game behavior is not a human universal, but is highly dependent on culture. We will discuss this dependence in part II of this series, and explore what it means for the Turing test and evolutionary game theory.
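To see why culture-dependent notions of fairness matter for a judge, here is a minimal sketch of a single ultimatum game in Python. The function name and the acceptance thresholds are hypothetical and chosen only for illustration; they are not data from Henrich et al. (2001). The proposer offers part of a pie, and the responder accepts only if the offer meets their minimum acceptable share; a rejection leaves both with nothing.

```python
def ultimatum_payoffs(pie, offer, min_acceptable):
    """Return (proposer payoff, responder payoff) for one ultimatum game.

    The responder accepts if and only if offer >= min_acceptable.
    """
    if not 0 <= offer <= pie:
        raise ValueError("offer must lie between 0 and the size of the pie")
    if offer >= min_acceptable:
        return pie - offer, offer
    return 0.0, 0.0

# Hypothetical minimum acceptable offers out of a pie of 10 (NOT empirical values).
thresholds = {
    "objectively rational responder": 0.01,  # accepts any positive offer
    "hypothetical group A": 3.0,             # rejects offers it sees as unfair
    "hypothetical group B": 1.0,
}

for name, threshold in thresholds.items():
    print(name, ultimatum_payoffs(10.0, 2.0, threshold))
```

The same offer of 2 out of 10 is accepted by two of these responders and rejected by the third, so a judge calibrated on one population could easily mistake a human from another for a machine.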

Until next time, I leave you with some questions that I wish I knew the answer to: Can we ever define intelligence? Can intelligence be operationalized? Do universals that are central to intelligence exist? Is intelligence a cultural construct? If there are intelligence universals, then how should we modify the mode of interface used by the Turing test to focus only on them?

This post continues with a review of Henrich et al. (2001) in Part 2

References

Henrich, J., Boyd, R., Bowles, S., Camerer, C., Fehr, E., Gintis, H., & McElreath, R. (2001). In Search of Homo Economicus: Behavioral Experiments in 15 Small-Scale Societies. American Economic Review, 91 (2), 73-78.

Strannegård, C., Amirghasemi, M., & Ulfsbäcker, S. (2013). An anthropomorphic method for number sequence problems. Cognitive Systems Research, 22-23, 27-34. DOI: 10.1016/j.cogsys.2012.05.003

Strannegård, C., Cirillo, S., & Ström, V. (2012). An anthropomorphic method for progressive matrix problems. Cognitive Systems Research.

Turing, A. M. (1950). Computing Machinery and Intelligence. Mind.

Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185 (4157), 1124-1131.


About Artem Kaznatcheev
From the Department of Computer Science at Oxford University and Department of Translational Hematology & Oncology Research at Cleveland Clinic, I marvel at the world through algorithmic lenses. My mind is drawn to evolutionary dynamics, theoretical computer science, mathematical oncology, computational learning theory, and philosophy of science. Previously I was at the Department of Integrated Mathematical Oncology at Moffitt Cancer Center, and the School of Computer Science and Department of Psychology at McGill University. In a past life, I worried about quantum queries at the Institute for Quantum Computing and Department of Combinatorics & Optimization at University of Waterloo and as a visitor to the Centre for Quantum Technologies at National University of Singapore. Meander with me on Google+ and Twitter.

13 Responses to Games, culture, and the Turing test (Part I)

himalayanbuddhistart says:

March 8, 2013 at 10:18

Scientific intelligence is only one type of intelligence. Most people can’t work complicated things out but they can survive in a dangerous environment or shine out in an ordinary one. Some people can solve difficult quizzes or beat anyone else at chess but they don’t know how to communicate successfully with their peers or to behave kindly to others. Isn’t intelligence the ability to adapt to circumstances, the capacity to understand one’s environment and react accordingly while retaining one’s dignity?

- Artem Kaznatcheev says:
  
  March 8, 2013 at 10:34
  
  I hope my post didn’t imply that I was defining a “scientific intelligence” or equating ‘intelligence’ to it. The Turing test is in fact a very bad way to measure “scientific intelligence” as you describe it and the economic Turing test (eTT) actually relies on being able to pick up the emotional and irrational glitches that define human behavior. A perfectly rational scientist would fail the eTT.
  
  In general, the point of my post was to suggest that “intelligence” is an unreasonably loaded term with way too many definitions that keep varying over time and (more importantly) culture. The one you suggest is popular in some circles, for instance on LessWrong. However, I think such a definition is better called “adaptability”. Your exact wording has some difficulties though. How do you operationalize “understanding”? What do you mean by “retaining one’s dignity”? How would you operationalize that? In particular, I think that explaining what you mean by ‘understanding’ is actually more-or-less equivalent to having to define ‘intelligence’ in the first place.
  
  My personal stance is that it is better to avoid the word ‘intelligence’ in technical or academic discussion, and relegate it to the world of folk-psychology alongside words like subconscious.
  
Terrific T says:

March 18, 2013 at 01:37

I am very interested in the concept of intelligence as defined by the IQ test. I took one when I was little and since then have been curious how the test was developed and whether the result would have been different if I were born elsewhere, grew up in a different environment, used another language, etc. (I know that the tests were supposedly developed independent of language skills, but I wonder whether the fact that I grew up learning Chinese characters, for example, might affect the results). And in the end, does it matter at all? I agree that it is definitely a very loaded term and tend to be a bit suspicious about putting too much emphasis on the importance of intelligence, especially in research articles where the term is not clearly discussed in the first place.

With regard to the Turing test: is it about whether the machine is “intelligent”? Or about whether the machine can act in the same capacity as a human being (an intellectual being)?

And definitely agree that “The fundamental flaw is the assumption that game behavior is a human universal.”

- Artem Kaznatcheev says:
  
  March 18, 2013 at 03:07
  
  I remember hearing that IQ tests are culture dependent, but I can’t seem to find any research to back that up. It would probably make a good CogSci.SE question. I know that IQ tests were believed to be highly heritable (80% genetics, 20% environment) based on research done with WEIRD participants (see part II of this series for definition) but later shown by Turkheimer et al. (2003) to be much less heritable in settings with a more adverse environment (with 10% genetic and the rest environmental). This suggests that whatever IQ measures has at least a strong developmental component, if not a cultural one. I dunno how language affects this, but you might be interested in looking more into the Whorf hypothesis since it can definitely mess with consciousness, so maybe IQ as well?
  