Updated: November 2017
One common critique of functionalist and illusionist theories of consciousness2 is that, while some of them may be “on the right track,” they are not elaborated in enough detail to provide compelling accounts of the key explananda of human consciousness,3 such as the details of our phenomenal judgments, the properties of our sensory qualia, and the apparent unity of conscious experience.4
In this report, I briefly describe a preliminary attempt to build a software agent which might be seen as at least somewhat responsive to this critique.5 This software agent, written by Buck Shlegeris, aims to instantiate some cognitive processes that have been suggested, by David Chalmers and others, as potentially playing roles in an illusionist account of consciousness. In doing so, the agent also seems to exhibit simplified versions of some explananda of human consciousness. In particular, the agent judges some aspects of its sensory data to be ineffable, judges that it is impossible for an agent to be mistaken about its own experiences,6 and judges inverted spectra to be possible.
I don’t think this software agent offers a compelling reply to the critique of functionalism and illusionism mentioned above, and I don’t think it is “close” to being a moral patient (given my moral intuitions). However, I speculate that the agent could be extended with additional processes and architectural details that would result in a succession of software agents that exhibit the explananda of human consciousness with increasing thoroughness and precision.7 Perhaps after substantial elaboration, it would become difficult for consciousness researchers to describe features of human consciousness which are not exhibited (at least in simplified form) by the software agent, leading to some doubt about whether there is anything more to human consciousness than what is exhibited by the software agent (regardless of how different the human brain and the software agent are at the “implementation level,” e.g. whether a certain high-level cognitive function is implemented using a neural network vs. more traditional programming methods).
However, I have also learned from this project that this line of work is likely to require more effort and investment (and thus is probably lower in expected return on investment) than I had initially hoped, for reasons I explain below.
1 How the agent works
The explanation below is very succinct and may be difficult to follow, especially for those not already familiar with the works cited below and in the footnotes. Those interested in the details of how the agent works are encouraged to consult the source code.
The agent is implemented as a Python program that can process two types of text commands: either an instruction that the agent has “experienced” a color, or a question for the agent to respond to.
Each color is identified by its number, 0-255, such that (say) ‘20’ corresponds to my quale of ‘red,’ ‘21’ corresponds to my quale of something very close to but not quite ‘red,’ and so on.8 Upon being “experienced,” each color is stored in the agent’s memory in the order it was experienced (color1, color2, etc.).
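To make this setup concrete, here is a minimal sketch (my own illustration, not Buck's actual implementation) of the two command types and the agent's memory; the class and method names are assumptions for exposition only.

```python
# Illustrative sketch only -- not the actual implementation.
class Agent:
    """Accepts two kinds of commands: 'experience a color' and 'answer a question'."""

    def __init__(self):
        self.memory = []  # colors in the order experienced: color1, color2, ...

    def experience(self, color):
        """Record that the agent has 'experienced' a color (an integer 0-255)."""
        assert 0 <= color <= 255
        self.memory.append(color)

    def answer(self, question):
        """In the real agent, answering passes axioms to Z3 (see below); stubbed here."""
        raise NotImplementedError


agent = Agent()
agent.experience(20)  # e.g. my quale of 'red'
agent.experience(21)  # very close to, but not quite, 'red'
```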
To respond flexibly to questions, the agent makes use of the Z3 theorem prover. Whenever the agent receives a question, it passes all of its axioms (representing its knowledge) to Z3, which serves as the agent’s general reasoning system. Z3 then returns a “judgment” in response to the query. These “phenomenal judgments” are meant to instantiate, in simplified form, some (but far from all) explananda of human consciousness.
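As a rough illustration of this query loop (a sketch of my own, which need not match the actual code), a Z3-backed “judgment” can be produced by checking entailment in both directions: answer “yes” if the axioms entail the claim, “no” if they entail its negation, and “I don’t know” otherwise.

```python
from z3 import Solver, Not, unsat

def entails(axioms, claim):
    """True iff the axioms entail the claim, i.e. axioms plus its negation are unsatisfiable."""
    s = Solver()
    s.add(*axioms)
    s.add(Not(claim))
    return s.check() == unsat

def judge(axioms, claim):
    """Return the agent's judgment about a claim, given its knowledge (axioms)."""
    if entails(axioms, claim):
        return "yes"
    if entails(axioms, Not(claim)):
        return "no"
    return "I don't know"
```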
First, consider judgments about colors — a familiar kind of phenomenal judgment in humans. The software agent also makes judgments about colors. Specifically, the agent judges that each color it has experienced has some absolute ‘value’ (red experiences are intrinsically red and not, say, blue), but (like a human) it doesn’t know how to say what that value is, other than to (e.g.) say whether a color is more similar to one color or another (e.g. red is more similar to orange than it is to blue). This is because the agent’s reasoning system doesn’t have access to the absolute values (0-255) of the colors it has seen (even though they are stored in memory), and it also doesn’t “know” anything about how its reasoning system works or why it doesn’t have access to that information. Instead, it only has access to information about the magnitude of the differences between the colors it has seen. Thus, when asked “Is the 1st color you saw the same as the 6th color you saw?” the agent will reply “yes” if the difference is 0, and otherwise it will reply “no.”9 And when asked “Is the 1st color you saw more similar to the 2nd color you saw, or the 3rd color you saw?” the agent is again able to reply correctly. But when asked “Is the 4th color you saw ‘20’?” it will respond “I don’t know,” because the reasoning system doesn’t have access to that information. This is somewhat analogous to Chalmers’ suggestion that ineffability is an inevitable consequence of information loss during cognitive processing, and our lack of direct cognitive access to the facts about that process of information loss.10
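To make the difference-only access concrete, here is a continuation of the sketch above (again my own illustration rather than the project’s code), in which the reasoner is given axioms only about pairwise differences, never the stored 0-255 values themselves:

```python
from z3 import Int, If, Solver, Not, unsat

def judge(axioms, claim):
    # Same yes / no / "I don't know" helper as in the previous sketch.
    def entails(c):
        s = Solver()
        s.add(*axioms)
        s.add(Not(c))
        return s.check() == unsat
    if entails(claim):
        return "yes"
    if entails(Not(claim)):
        return "no"
    return "I don't know"

def dist(a, b):
    """Symbolic absolute difference between two colors."""
    d = a - b
    return If(d >= 0, d, -d)

memory = [20, 24, 200, 20]  # hidden absolute values of color1..color4 (toy data)
colors = [Int(f"color{i+1}") for i in range(len(memory))]

# The reasoner only sees facts about differences, not the raw values in `memory`.
axioms = [colors[i] - colors[j] == memory[i] - memory[j]
          for i in range(len(memory)) for j in range(i + 1, len(memory))]

print(judge(axioms, colors[0] == colors[3]))  # "yes": the difference is 0
print(judge(axioms, dist(colors[0], colors[1]) < dist(colors[0], colors[2])))  # "yes"
print(judge(axioms, colors[3] == 20))  # "I don't know": no access to absolute values
```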
This agent design naturally leads to another phenomenal judgment observed in humans, namely the intuitive possibility of an inverted spectrum, e.g. a situation “in which strawberries and ripe tomatoes produce visual experiences of the sort that are actually produced by grass and cucumbers, grass and cucumbers produce experiences of the sort that are actually produced by strawberries and ripe tomatoes, and so on.” For our purposes, we imagine that the agent has spoken to other agents, and thus knows that other agents also talk about having color experiences, knows that they seem to believe the same things about how e.g. red is more similar to orange than to blue, and knows that they also don’t seem to have access to information about the ‘absolute value’ of their color experiences. In that situation, the agent concludes that inverted (or rotated) spectra are possible.11
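To illustrate why the agent cannot rule this out (again a toy sketch of my own, which ignores wraparound at 255), one can ask Z3 whether it is consistent with everything the agent knows that another agent’s color values all differ from its own while satisfying exactly the same difference facts:

```python
from z3 import Int, Solver, And, sat

memory = [20, 24, 200, 20]  # hidden absolute values, as in the previous sketch

mine = [Int(f"mine{i}") for i in range(len(memory))]
theirs = [Int(f"theirs{i}") for i in range(len(memory))]

s = Solver()
for i in range(len(memory)):
    for j in range(i + 1, len(memory)):
        d = memory[i] - memory[j]
        s.add(mine[i] - mine[j] == d)     # everything I know about my own colors
        s.add(theirs[i] - theirs[j] == d) # the other agent reports the same relations

# Is it consistent with all of that knowledge that our color values nonetheless all differ?
s.add(And(*[mine[i] != theirs[i] for i in range(len(memory))]))

print(s.check() == sat)  # True: a shifted ("rotated") spectrum cannot be ruled out
```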
Finally, another phenomenal judgment familiar to humans is the judgment that while one can be mistaken about the world, one cannot be mistaken about what one has experienced. In the software agent, this same judgment is produced via a mechanism suggested by Kammerer (2016), which Frankish (2016b) summarized this way:
[According to Kammerer’s theory,] introspection is informed by an innate and modular theory of mind and epistemology, which states that (a) we acquire perceptual information via mental states — experiences — whose properties determine how the world appears to us, and (b) experiences can be fallacious, a fallacious experience of A being one in which we are mentally affected in the same way as when we have a veridical experience of A, except that A is not present.
Given this theory, Kammerer notes, it is incoherent to suppose that we could have a fallacious experience [i.e. an illusory experience] of an experience, E. For that would involve being mentally affected in the same way as when we have a veridical experience of E, without E being present. But when we are having a veridical experience of E, we are having E (otherwise the experience wouldn’t be veridical). So, if we are mentally affected in the same way as when we are having a veridical experience of E, then we are having E. So E is both present and not present, which is contradictory…
For details on how this mechanism is implemented in the software agent, see the code.
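As a rough indication of how such an argument can be formalized (this is my own propositional rendering of the quoted reasoning, not necessarily the axioms used in the code), Z3 finds the supposition of a fallacious experience of an experience E to be unsatisfiable:

```python
from z3 import Bools, Solver, Implies, And, Not, unsat

fallacious_of_E, affected_as_veridical, having_E = Bools(
    "fallacious_of_E affected_as_veridical having_E")

s = Solver()
# A fallacious experience of E: mentally affected as in the veridical case, but E is absent.
s.add(Implies(fallacious_of_E, And(affected_as_veridical, Not(having_E))))
# Because E is itself an experience, being affected as in a veridical experience of E
# just is having E.
s.add(Implies(affected_as_veridical, having_E))
# Suppose the agent has a fallacious experience of E.
s.add(fallacious_of_E)

print(s.check() == unsat)  # True: the supposition is contradictory
```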
2 Some lessons learned from this project
In my 2017 Report on Consciousness and Moral Patienthood, I listed a more ambitious version of the present project as a project that seemed especially promising (to me) for helping to clarify the likely distribution of phenomenal consciousness (and thus, on many theories, of moral patienthood).12 I still think work along the lines begun here could be helpful, but my estimate of the return on investment from such work has decreased, mostly (but not entirely) because my estimate of the cost of doing this kind of work has increased. In particular:
- Implementing the proposed mechanisms (e.g. from Chalmers and Kammerer) requires a large amount of “baggage” in the code (e.g. for using a theorem prover) that doesn’t illuminate anything about consciousness but is needed to set up the code so that it can implement the proposed mechanism. This “baggage” requires substantial programming work, and also makes it more cumbersome to write (and read) a full explanation of how the program implements the proposed mechanisms.
- Before the project began, I guessed that in perhaps 20% of cases, the exercise of finding a way to program a suggested mechanism would lead to some interesting clarification about how good a proposal the mechanism was, e.g. because the proposed mechanism would turn out to be incoherent in a subtle way, or because we would discover a much simpler mechanism that provided just as good an explanation of the targeted explanandum. However, based on the details of our experience implementing a small number of mechanisms, I’ve lowered my estimate of how often the exercise of finding a way to code a proposed mechanism of consciousness will lead to an interesting clarification.
- A project like this would benefit greatly from career consciousness scholars who are more steeped in the literature, the thought experiments, the arguments, the nuances, etc. than either Buck or I am.
- I don’t think a program which implements three (or even five) mechanisms will be enough to learn or demonstrate the main thing I’d hoped to learn/demonstrate, namely that (as I write above) “the agent could be extended with additional processes and architectural details that would result in a succession of software agents that exhibit the explananda of human consciousness with increasing thoroughness and precision [such that] perhaps after substantial elaboration, it would become difficult for consciousness researchers to describe features of human consciousness which are not exhibited (at least in simplified form) by the software agent, leading to some doubt about whether there is anything more to human consciousness than what is exhibited by the software agent…”
- Even if we took the time to implement (say) 10 proposed mechanisms for various features of consciousness, it’s now clear to me that a compelling explanation of those mechanisms (as implemented in the software agent) would be so long that very few people would read it.
For these reasons and more, we don’t intend to pursue this line of work further ourselves. We would, however, be interested to see others make a more serious effort along these lines, and we would consider providing funding for such work if the right sort of team expressed interest.
3 Appendix: Notes to users of the agent’s code
This appendix is written by Buck Shlegeris, who wrote the code of the software agent, which is available on GitHub here.
In this appendix, I explain some of the decisions I made in the course of the project and describe some of the difficulties we encountered.
I wrote the code in Python because it’s popular, easy to read, and has lots of library support. The main library we use is the Python bindings for Z3, which is a popular theorem prover.
Almost all of the complexity of this implementation is in the first order logic axioms that we pass to Z3. The rest of the code is mostly a very simple object oriented sketch of the architecture of an agent.
Implementing proposed mechanisms of conscious experience in Z3 was difficult. Expressing yourself in first order logic is always clunky, and Z3 often couldn’t prove the theorems we wanted unless we expressed them in very specific ways. I suspect that a programmer with more experience in theorem provers would find this less challenging.
Also, there were many ideas that we wanted to express but which first order logic can’t handle. I’ll mention three examples.
First, it would have been easier to express human-like intuitions about inverted spectra if the theorem prover could reason about communication between agents, e.g. if it could prove something like “No matter what question system A and system B ask each other, they won’t be able to figure out whether their qualia are the same or not.” This can’t be expressed in first order logic, but I believe it can be expressed in modal logic. Perhaps this kind of project would work better in a modal logic theorem prover.
Second, it’s not very easy to express the fuzziness of beliefs using first order logic. A lot of our intuitions about consciousness feel fuzzy and unclear. In first order logic (FOL), we’re not able to express the idea that some beliefs are more intuitive than others. We’re not able to say that you believe one thing by default, but could be convinced to believe another. For example, I think that the typical human experience of the inverted spectrum thought experiment is that you’ve never thought about the inverted spectrum before and you’d casually assumed that everyone else sees colors the same way as you do, but then someone explains the thought experiment to you, and you realize that actually your beliefs are consistent with it. This kind of belief-by-default which is defeasible by explicit argument is not expressible in first order logic.
Logicians have developed a host of logical systems that try to add the ability to express concepts that humans find intuitively meaningful and that FOL isn’t able to represent. I’m skeptical of using the resulting logical systems as a tool to get closer to human decision-making abilities, because I think that human logical reasoning is a complicated set of potentially flawed heuristics on top of something like probabilistic reasoning, and so I don’t think that trying to extend FOL itself is likely to yield anything that mirrors human reasoning in a particularly deep or trustworthy way. However, it’s plausible that some of these logics might be useful tools for doing the kind of shallow modelling that we attempted in this project. Some plausibly relevant logics are default logic and fuzzy logic, potentially combined into fuzzy default logic.
Third, I can’t directly express claims about the deductive processes that an agent uses. For example, Armstrong (1968) describes a theory about a deductive process that humans might have; namely, that in certain conditions, we reason from “I don’t perceive that X is Y” to “I perceive that X is not Y.” To express this, we might need to use a logic that has features of default logic or modal logic.
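One crude workaround is to approximate such a rule at the meta-level, outside the FOL axioms themselves. The following toy sketch (my own, not part of the project’s code) uses negation-as-failure in plain Python around Z3, which also illustrates why the rule isn’t captured by the first order axioms alone; the proposition name is hypothetical.

```python
from z3 import Bool, Solver, Not, unsat

def provable(axioms, claim):
    s = Solver()
    s.add(*axioms)
    s.add(Not(claim))
    return s.check() == unsat

# Hypothetical proposition; the agent's axioms say nothing about it.
perceive_X_is_Y = Bool("perceive_X_is_Y")
axioms = []

# Armstrong-style default, applied at the meta-level: failing to prove
# "I perceive that X is Y", conclude "I perceive that X is not Y".
if provable(axioms, perceive_X_is_Y):
    print("I perceive that X is Y")
else:
    print("I perceive that X is not Y")
```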
In general, Z3 is optimized for projects which require the expression of relatively complicated problems in relatively simple logics, whereas for this project we wanted to express relatively simple problems in relatively complicated logics. Perhaps a theorem prover based on something like graph search over proofs would be a better fit for this type of project.
4 Sources
DOCUMENT | SOURCE |
---|---|
Aleksander (2017) | Source (archive) |
Armstrong (1968) | Source (archive) |
Bayne (2010) | Source (archive) |
Bennett & Hill (2014) | Source (archive) |
Bjorner (2017) | Source (archive) |
Brook & Raymont (2017) | Source |
Buck Shlegeris | Source (archive) |
Byrne (2015) | Source |
Chalmers (1990) | Source (archive) |
Chalmers (1996) | Source (archive) |
Chalmers (2017a) | Source (archive) |
Chalmers (2017b) | Source (archive) |
Chalmers (2017c) | Source (archive) |
Clark (1993) | Source (archive) |
Cold Spring Harbor Laboratory (2001) | Source (archive) |
Drescher (2006) | Source (archive) |
Feynman (1988) | Source (archive) |
Frankish (2016a) | Source (archive) |
Frankish (2016b) | Source (archive) |
Gamez (2008) | Source (archive) |
Graziano (2016) | Source (archive) |
Herzog et al. (2007) | Source (archive) |
Kammerer (2016) | Source (archive) |
Loosemore (2012) | Source (archive) |
Marinsek & Gazzaniga (2016) | Source (archive) |
Molyneux (2012) | Source (archive) |
O’Regan (2011) | Source (archive) |
Reggia (2013) | Source (archive) |
Rey (1983) | Source (archive) |
Rey (1995) | Source (archive) |
Rey (2016) | Source (archive) |
Shlegeris (2017) | Source (archive) |
Tomasik (2014) | Source (archive) |
Weisberg (2014) | Source (archive) |
White (1991) | Source (archive) |