
Question-answering robot

2014-12-15

Continuing my exploration of meaning and reference. Some of the following repeats what I’ve already written in this blog, sorry. I’m using the blog as a way to work out the exposition, which is still too long and too abstract.

TL;DR You can’t independently specify or test the meanings of sentence parts. Specifying and testing meaning always reduces to specifying and testing the meanings of complete sentences.

Propositions again

When an agent A sends a message to an agent B, one or both of the following should hold:

  • there is a correlation between what holds (what the world is like) before generating the message and A’s choice of message to generate,
  • there is a correlation between the message that B receives and what holds after interpreting the message.

Without at least one of these, nothing has been communicated.

By “correlate” I mean that the message and what holds vary together. E.g. as the color of a disk varies between green, red, and blue, the message might vary between “the disk is green”, “the disk is red”, and “the disk is blue”.

I’ll call what holds before generating a message a precondition (of message generation), and what holds after interpreting a message a postcondition (of message interpretation). When I use these words I really mean least inclusive precondition and most inclusive postcondition, since otherwise the terms are not helpful.

The precondition case covers messages that are simple declarative sentences (“it is raining”). I’ll call such messages “p-messages”. (Being a p-message is not an inherent property of a message. To classify a message as a p-message you have to know something of the sender’s and receiver’s behavior.)

We can experimentally test any candidate proposition for whether it is the precondition of some message being sent. Just vary the agent’s circumstances (i.e. state) and watch what messages the agent sends. If the given message is sent if and only if the proposition holds, then the proposition is the precondition of sending that message.
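Here is a minimal sketch of that test, assuming a toy world whose whole state is the color of one disk. The names `honest_agent` and `is_precondition` are made up for illustration, and the agent is a stand-in for whatever system is actually under test.

```python
# Hypothetical world: the state is just the color of a single disk.
COLORS = ["green", "red", "blue"]

def honest_agent(state):
    """Toy agent (an assumption): reports the disk color it observes."""
    return f"the disk is {state['disk']}"

def is_precondition(agent, message, proposition, states):
    """True if `agent` sends `message` in exactly the states where `proposition` holds."""
    return all((agent(s) == message) == proposition(s) for s in states)

# Vary the agent's circumstances over every state of the toy world.
states = [{"disk": c} for c in COLORS]

def disk_is_green(s):
    return s["disk"] == "green"

def disk_is_not_red(s):
    return s["disk"] != "red"

print(is_precondition(honest_agent, "the disk is green", disk_is_green, states))    # True
print(is_precondition(honest_agent, "the disk is green", disk_is_not_red, states))  # False
```

The second candidate fails because it also holds when the disk is blue, a state in which the message is not sent.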

Imperative sentences (“please close the window”) can be treated in a dual manner; one might call them s-messages and say that the postcondition of interpretation is a specification. Again, the claim that a particular proposition is the postcondition can be tested.

You might ask: Well, what if the precondition of the generated message isn’t met (the system “lies”), or the postcondition of interpretation isn’t met (it “betrays” us)? How can you call something a postcondition of interpretation, when success is not guaranteed? You could say a specification isn’t met, or that a theory is wrong. But in any engineered system success is never guaranteed. Things go wrong. Perhaps some constituent part does not live up to its specification, or the system is operating outside of its specified operating zone. You could put the qualifier “unless something goes wrong” in front of everything we say about the system, but that would not be very helpful.

Questions and answers

It’s less clear what to say about interrogative sentences (“what color is the disk?”) and responses to them (“green”).

For the sake of neutrality, and to emphasize their syntactic nature, I’ll call interrogative sentences “q-messages” and their responses “a-messages”, q and a being mnemonic for “question” and “answer” respectively.

Consider a scenario in which a q-message is sent to a question-answering robot, and an a-message is sent in response. To apply the pre- and post-condition framework given above, we need to consider the postcondition of interpreting the q-message, and the precondition of generating the a-message. (I’ll only consider what it takes to specify or describe the question-answering robot, not the agent that communicates with it, to which I grant total freedom.)

What needs to be the case after the q-message is received? Well, an a-message must be sent; but not just any a-message. As with p-messages, for any a-message, a certain precondition must be met. But crucially, the precondition, and therefore the choice of a-message, depends on what the q-message is. The question is, what is the precondition of sending a-message ax, given that the preceding q-message was qx? (‘x’ is for ‘syntax’)

If we’re trying to specify the behavior of the robot, we need to specify, for each q-message, what the allowable a-messages are, as a function of the agent’s current state. The robot can choose among these a-messages.

One way to do this is by brute force enumeration. For each qx, write down the function (perhaps nondeterministic) from circumstances to answers ax. The size of the specification is going to be proportional to m*n where m is the number of possible q-messages qx and n is the number of possible a-messages ax.

A better way is to exploit the structure of the robot’s world. When we ask what the color of the disk is, and what the color of the square is, we’re asking similar questions. Each color-inquiring q-message can be associated with a ‘spot’ in the world that can have its color sensed. When the q-message is received, the color state of the spot that it designates can be sensed and an appropriate a-message can be chosen.
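To make the contrast concrete, here is a rough sketch assuming a world with two spots, a disk and a square, each of which can be one of three colors. The table `SPOT_OF` and the function `answer` are illustrative names, not part of any real system.

```python
from itertools import product

COLORS = ["green", "red", "blue"]

# Hypothetical association of each color-inquiring q-message with the spot it designates.
SPOT_OF = {
    "what color is the disk?": "disk",
    "what color is the square?": "square",
}

def answer(qx, state):
    """Sense the color of the spot that qx designates and use it as the a-message."""
    return state[SPOT_OF[qx]]

state = {"disk": "green", "square": "blue"}
print(answer("what color is the disk?", state))    # green
print(answer("what color is the square?", state))  # blue

# For comparison, the same behavior written out as a brute-force table:
# one row per (q-message, world state) pair.
states = [{"disk": d, "square": s} for d, s in product(COLORS, repeat=2)]
brute_force = {(qx, ws["disk"], ws["square"]): answer(qx, ws)
               for qx in SPOT_OF for ws in states}
print(len(brute_force), "table rows versus", len(SPOT_OF), "spot associations")
```

In this toy setup the table needs 18 rows, while the structured version needs two associations plus one sensing rule, and the gap widens as spots are added.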


It is natural to interpret q-messages as questions, and a-messages as answers, just as p-messages can be interpreted as propositions. This may be difficult or impossible for a particularly perverse robot, but if we are designing one ourselves, our ability to interpret messages is something we can control.

The proposition corresponding to a p-message can be inferred by studying the conditions under which the p-message is sent. Things are trickier regarding interpretation of q-messages and a-messages. For a q-message, we can look at how the resulting a-message varies with aspects of the world. If we can find a variable in the world that varies along with the a-message (correlates with it), and doesn’t vary otherwise [except within spans in which a single a-message covers many values – think about this], then we can say that the question is the one that asks what the value of that variable is.

Similarly, we can interpret an a-message as an answer: it is the answer that says that the variable that the preceding question (whatever it is) asks about takes on a value that can elicit the a-message, given that the preceding q-message is interpreted to be that question.
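The search for a variable that a q-message asks about can be sketched as follows. This assumes a finite, fully sampled world with two variables, and a stand-in `robot` function whose internals we pretend not to know; the opaque q-message names `q1` and `q2` and the helper `variables_asked_about` are hypothetical.

```python
from itertools import product

COLORS = ["green", "red", "blue"]
VARIABLES = ["disk", "square"]

def robot(qx, state):
    """Stand-in for the robot under study (an assumption); we only observe its outputs."""
    return state["disk"] if qx == "q1" else state["square"]

def variables_asked_about(robot, qx, states, variables):
    """Variables whose value alone fixes the a-message for qx, and with which it varies."""
    found = []
    for var in variables:
        seen = {}  # value of var -> set of a-messages observed alongside that value
        for s in states:
            seen.setdefault(s[var], set()).add(robot(qx, s))
        # The a-message must not vary while var stays fixed ...
        determined = all(len(msgs) == 1 for msgs in seen.values())
        # ... and it must vary at all as var changes.
        varies = len(set.union(*seen.values())) > 1
        if determined and varies:
            found.append(var)
    return found

states = [dict(zip(VARIABLES, cs)) for cs in product(COLORS, repeat=len(VARIABLES))]
print(variables_asked_about(robot, "q1", states, VARIABLES))  # ['disk']
print(variables_asked_about(robot, "q2", states, VARIABLES))  # ['square']
```

The `determined` check tolerates one a-message covering several values of the variable, which is the exception noted in brackets above.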


There is a tidy way to look at questions and answers using a simple formal veneer.

Any proposition p induces a function pf from world states to {true, false}, defined so that pf yields true when p holds in that world state, and false otherwise. (To spice things up I sometimes say “what holds” or “circumstances” instead of “world state.”) Call such a function a “p-function”.

Similarly, a question q induces a “q-function” qf from world states to values, and an answer a induces an “a-function” af from values to {true, false}. qf determines the value corresponding to a world state, and af tells whether an answer a is acceptable for a given value.

Consider the proposition that q has a as an answer. Call this proposition z. Let qf be the function induced by q, af be the function induced by a, and zf be the function induced by z. Then the following holds:

  zf = af ∘ qf

that is, for every world state ws,

  zf(ws) = af(qf(ws))

In words: a question/answer pair is (like) a factorization of a proposition.
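As a small illustration of the composition, here is a sketch in which world states are dictionaries, the question is “what color is the disk?” and the answer is “green”. The dictionary representation and the helper `compose` are assumptions made for the example.

```python
def compose(af, qf):
    """Build the p-function zf = af o qf from an a-function and a q-function."""
    return lambda ws: af(qf(ws))

def qf(ws):
    """q-function for "what color is the disk?": world state -> value."""
    return ws["disk"]

def af(value):
    """a-function for "green": value -> {True, False}."""
    return value == "green"

# zf is the p-function of the proposition "the disk is green".
zf = compose(af, qf)

print(zf({"disk": "green"}))  # True
print(zf({"disk": "red"}))    # False
```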

Any formalism is likely to drop some of the richness of what it models. Real propositions, questions, and answers probably have more structure to them than functions do. Whether enough structure is captured depends on how the formalism is applied. In this context we’re concerned with specification and prediction, and functions may work fine.


Specifying and testing

It makes sense to specify that a p-message MUST “mean” a particular proposition p – you are just saying that the robot must generate the p-message if and only if p. We can test to see whether the robot satisfies this condition.

Suppose we tried to specify that a q-message MUST “mean” a particular question q. A specification must be testable. How would a claim that qx means q (when the robot interprets qx) be tested? We’d have to see what a-messages were generated in response to qx – they would have to be the ones that “mean” correct answers to q. But to say this, we need to specify that a set of a-messages MUST “mean” a corresponding set of answers. Then, to test whether an a-message “means” a particular answer a, you’d have to send a bunch of q-messages, and for each one, check whether the a-message does or does not come back, according to whether the answer a is an answer to the question that the q-message “means”. But then you’d have to specify what each q-message “means”. This is circular.

This is therefore not the way to specify the behavior of a question-answering robot. What you have to do is to define a correspondence between q-messages and questions, and a second correspondence between a-messages and answers. Because we’re writing the specification we can simply do so by fiat, by way of exposition, just as in a specification for motor oil you might say ‘define v = 0.0114’ and then use ‘v’ elsewhere in the specification. Simply defining correspondences does not by itself say anything about what the robot has to do. Then, we specify that when a q-message is received, the a-message generated MUST be one with the property that the corresponding answer is an answer to the question corresponding to the q-message that was received.

An alternative, fully equivalent approach would be to specify the behavior of the robot using the formalism. You could define a correspondence between q-messages and q-functions, and between a-messages and a-functions, and say that the generated a-message MUST be one that makes the composition of the q-function and the a-function evaluate to true when applied to the world state. These correspondences give an interpretation of the q- and a-messages that is just as effective as the interpretation where they are questions and answers.
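A specification of this kind can be turned into a conformance check. The sketch below assumes the same toy world of a disk and a square; the correspondences `Q_FUNCTIONS` and `A_FUNCTIONS` are defined by fiat, and the two robots are hypothetical test subjects.

```python
from itertools import product

COLORS = ["green", "red", "blue"]

# Correspondence between q-messages and q-functions, defined by fiat.
Q_FUNCTIONS = {
    "what color is the disk?": lambda ws: ws["disk"],
    "what color is the square?": lambda ws: ws["square"],
}

# Correspondence between a-messages and a-functions, also by fiat.
# (c=c binds each color at definition time.)
A_FUNCTIONS = {c: (lambda value, c=c: value == c) for c in COLORS}

def conforms(robot, states):
    """True if, for every q-message and world state, af(qf(ws)) holds for the robot's a-message."""
    return all(
        A_FUNCTIONS[robot(qx, ws)](Q_FUNCTIONS[qx](ws))
        for qx in Q_FUNCTIONS
        for ws in states
    )

def good_robot(qx, ws):
    """Hypothetical robot that senses the designated spot."""
    return ws["disk"] if "disk" in qx else ws["square"]

def always_green(qx, ws):
    """Hypothetical broken robot that ignores the world."""
    return "green"

states = [{"disk": d, "square": s} for d, s in product(COLORS, repeat=2)]
print(conforms(good_robot, states))    # True
print(conforms(always_green, states))  # False
```

Note that the check only ever tests q-message/a-message pairs against world states; neither correspondence is tested on its own.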

Going in the other direction, when we reverse engineer a question-answering robot, we have to come up with a theory that explains the data. The data consists of q-message/a-message pairs. As we develop our theory, the correspondences of q-messages and a-messages to meaning-like entities (questions/answers or q-functions/a-functions) have to be hypothesized and tested in tandem; we cannot understand q-messages in isolation, or a-messages in isolation.

Compositional languages

Given an understanding of question answering, it is very easy to imagine, or design, a language of p-messages that have two parts, one part being a q-message and the other an a-message. (Perhaps some punctuation or other trivial change sneaks in there, but that’s not to the point.) The meaning of the p-message – i.e. the precondition (proposition) that holds when it’s generated – is that the a-message is a correct response to the q-message. The analysis works exactly as it does for question answering.

This particular way of forming compound messages is an instance of the principle of compositionality, which holds when the meaning of a compound phrase (such as a p-message) is nontrivially determined by the meanings of its parts (in this case a q-message and a-message). I say “nontrivially” because in any language where phrases have parts you can always come up with some trivial definition of part meaning and composition – essentially a table lookup – that makes the phrase meaning the same as the composed meaning. Compositionality means that there is some compression going on: you’re not just listing all of the quadratically many cases.

Example: q-message “Which way is Lassie running?” + a-message “South.” => p-message “Lassie is running south.”
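The Lassie example can be sketched in the same style. Everything beyond the one q-message and one a-message in the example is an assumption added to show the compression: the second dog, the other three directions, and the names `Q_MEANING`, `A_MEANING`, and `p_meaning`.

```python
DIRECTIONS = ["north", "south", "east", "west"]

# Meanings of the q-message parts (q-functions). The second entry is hypothetical.
Q_MEANING = {
    "Which way is Lassie running?": lambda ws: ws["lassie"],
    "Which way is Rex running?": lambda ws: ws["rex"],
}

# Meanings of the a-message parts (a-functions): "North.", "South.", and so on.
A_MEANING = {f"{d.capitalize()}.": (lambda value, d=d: value == d) for d in DIRECTIONS}

def p_meaning(qx, ax):
    """Meaning of the compound p-message: ax is a correct response to qx."""
    qf, af = Q_MEANING[qx], A_MEANING[ax]
    return lambda ws: af(qf(ws))

ws = {"lassie": "south", "rex": "east"}
lassie_south = p_meaning("Which way is Lassie running?", "South.")
print(lassie_south(ws))                                      # True
print(p_meaning("Which way is Rex running?", "South.")(ws))  # False

parts = len(Q_MEANING) + len(A_MEANING)
compounds = len(Q_MEANING) * len(A_MEANING)
print(parts, "part meanings cover", compounds, "p-messages")
```

The number of part meanings grows as m + n while the number of p-messages they cover grows as m*n, which is the compression that separates compositionality from a table lookup.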

See also

  • Horwich, Truth Meaning Reality, of course…
  • Yablo, Aboutness, of course…
  • Carpenter, Type-Logical Semantics
  • Wittgenstein, Tractatus Logico-Philosophicus
  • Jeffrey King, The Nature and Structure of Content
  • Gopnik and Meltzoff, Words, Thoughts, and Theories

Afterthought: When reverse engineering there are always multiple theories (or should I say ‘models’, as the logicians do) that are consistent with the data, even when you account for isomorphisms. This is certainly true when the world state space is incompletely sampled, as it would be if it were continuous. But I think this holds even when everything is known about the robot’s world and behavior. It is customary, if you have your hands on multiple theories, to choose the simplest one in making predictions (Occam’s razor). (At this point I want to point you at the work of Noah Goodman…)

Afterthought: There’s a book that argues that young children are scientists developing and testing theories of the world. When I remember the name I’ll add it to the list.
