Dual licensing and exclusive rights

Ross Mounce asks:

How does dual licencing work when one license flatly contradicts the other e.g. CC-BY vs Elseviers “exclusive right to publish & distribute”

Here’s the problem he’s referring to:

“Elsevier is granted the following rights:

1. The exclusive right to publish and distribute an article, and to grant rights to others, including for commercial purposes. …”

This is indeed very confusing. When the work is first created, the author has (by copyright law) the exclusive right to publish, distribute, and grant rights. Then the author makes an agreement with Elsevier that (a) grants Elsevier exclusive rights to do these things – which I take to mean that the author is now excluded from doing them (everyone else is too, but they were already excluded by copyright law). But the author and Elsevier have also made an OA agreement that (b) requires Elsevier to distribute the work under a CC license. Once Elsevier does (b), as required, it no longer has an exclusive right to publish and distribute, because it has granted those rights to others (which it was able to do, since the author granted it the right to grant rights). (The grant is conditional on attribution, etc.) The wording “grant the exclusive right to publish” is very confusing, since the OA agreement requires Elsevier to turn right around and relinquish some of that exclusivity.

Elsevier might retain the right to publish and distribute in ways other than what they licensed, e.g. without attribution, or without a statement of the CC license. That is, it is free to dual license, while the author is excluded (by contract) from dual licensing. But that in no way negates the CC license, which is irrevocable.

But I don’t know what the OA part of the agreement looks like. If it doesn’t say that every copy that Elsevier makes must carry the agreed CC license, it’s not worth very much as an OA agreement, since the door would be open (legally) to dubious practices such as the one you observed: they can meet the OA agreement by publishing with the CC license for a year, say, and then removing the license statement. Then people who don’t know about the CC license will be tricked into paying them money, even though they don’t need to. (The absence of a license notice does not imply the absence of a license.)

To answer your question: (1) The statement about exclusive rights is not a license; it’s part of a contract that the author has with the publisher, under which the author agrees to give up rights (be excluded) in exchange for something else. It has no bearing on users of the material; there is no dual licensing here. Exclusivity comes from copyright law, not from any proclamation the publisher makes. (2) Even if this were a dual license, licenses cannot take rights away, so there is no way that any license can contradict or modify any other license. If you can do A because of license X, and you can do B because of license Y, then you can do A and B, no matter what the licenses may pretend to say about prohibitions on A and B. Prohibitions in a license can only be conditions on the exercise of rights: ‘you can do A if you do P’ (e.g. you can copy if you attribute) does not mean ‘if you do A you have to do P’, because you can perfectly well do A without P if a different license lets you. ‘You can do A only if you do P’ or ‘Joe has exclusive rights’ would be a prohibition, and a license, no matter what it claims, cannot globally prohibit anything that was not already prohibited (by copyright law).

It is possible for a contract to prohibit a party to the contract from doing something. This is why libraries are prohibited from doing things with journal articles (like text mining) that would otherwise be permitted under copyright law or a CC license.

(Also note that when you pay for access to an article, that has nothing to do with copyright. The ‘license’ granted to you in exchange for payment is for access, not the ability to copy. And a CC license does not require anyone to make the material accessible.)

IANAL, TINLA, etc.


How can you tell whether a robot is referring?

2015-02-06

In brief: I still don’t know.

I keep saying I’m going to work up to an examination of reference and objects. I’m not ready for this yet, but I wanted to put down a few thoughts.

Recall by way of motivation that on first encountering the so-called ‘semantic web’ and its dogma of ‘identification’ I felt that it didn’t belong in an engineering context without further explanation. When I expressed my discomfort to one of the principals, he challenged me to fix it.

I’ve claimed that propositions and the semantics (or pragmatics) of complete messages can be put on a foundation solid enough for scientific and engineering analysis. The question then is whether we can do something similar for reference. I put forth two explanations of propositions: one in terms of state spaces, and the other in terms of engineering specifications.

The state space explanation says that a proposition is a bipartition of the state space of a system (or world) into a block of states in which the proposition is true, and a block in which the proposition is false. State spaces are familiar in all kinds of systems analysis and are well within the comfort zone of engineers and mathematicians. Propositions can be related to one another, and whether an agent generates a message is itself a proposition. So this seems tidy.
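
To make this concrete, here is a minimal sketch in Python, using a toy one-variable world of my own invention (the disk and its colors are assumptions, purely for illustration):

    # Toy state space: the world is just the color of one disk.
    states = {"red", "green", "blue"}

    def bipartition(pred):
        """A proposition as a bipartition: the block of states where it
        is true, and the block where it is false."""
        true_block = frozenset(s for s in states if pred(s))
        return true_block, frozenset(states) - true_block

    disk_is_red = bipartition(lambda s: s == "red")
    disk_is_colored = bipartition(lambda s: True)

    # Relations between propositions are set operations on blocks:
    # p entails q iff p's true block is a subset of q's true block.
    assert disk_is_red[0] <= disk_is_colored[0]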

The engineering-specification explanation says that a proposition is something that can be tested. We can say that a message means a proposition if, roughly speaking, the message is generated when, and only when, the proposition is true. This kind of condition is fine as a specification if we can determine when the proposition holds, and many propositions are amenable to such a test – and the ones that aren’t are ones we probably don’t or shouldn’t care about.

So if someone claims that by message M, agent A means proposition P, we have ways to test whether it does – we don’t have to take such a claim on faith, and we don’t need to introspect to get the answer. It just does or doesn’t, and this can be determined experimentally.

The problem with reference (or the meaning of a noun phrase; I’m not going to bother yet with Frege’s sense/reference distinction) is how to get a comfortable corresponding story around claims of the form: by generating a message that has message part Z, agent A is referring to X. Suppose there were to be a dispute over the claim that by Z, A refers to X. How would it be settled? Not by repeating the claim, and not by introspection or projection, I hope.

This is an especially severe question when A is an engineered artifact (what I’ve been calling a ‘robot’) that is doing the putative referring (i.e. is sending the message that has the putatively referring part Z). How can you tell whether a robot is referring?

I take as given that robots *can* refer, since I believe that humans as language-speaking agents are different only in degree, not kind, from robots. There is no secret sauce that only humans have that lets them refer.

My homework: make another assault on On the Origin of Objects by Brian Cantwell Smith.

Pointless logics

I happened on a passage in Wikipedia calling out a connection between description logic and propositional dynamic logic (PDL). Both of these formal systems are pointless; they don’t have reference in the usual form. This is certainly appealing for someone trying to eliminate and rebuild reference. For any proposition, there is an implicit subject, an ‘individual’ in the DL case and a ‘world state’ in the PDL case. One can ‘talk about’ a new subject by saying what operator to apply to get from the current subject to the new one. You don’t refer to a subject, you give a path to access a new subject from an old one.
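
A sketch of the flavor, with a made-up miniature world (all the names below are mine, not from DL or PDL): the ‘language’ contains no names of subjects, only operators that move from the current subject to a new one.

    # Subjects are never named in the 'language'; you only give a path of
    # operators from the current subject. (The string keys are internal
    # bookkeeping for the simulation, not part of the language.)
    world = {
        "alice":      {"dog": "fido"},
        "fido":       {"tail": "fidos-tail", "owner": "alice"},
        "fidos-tail": {},
    }

    def follow(subject, path):
        """Apply a chain of operators to get from one subject to another."""
        for op in path:
            subject = world[subject][op]
        return subject

    # 'the tail of my dog', said from Alice's standpoint:
    assert follow("alice", ["dog", "tail"]) == "fidos-tail"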

Objects as spindles

Here is something I keep picturing. Sentences that share a common subject (phrase) mean propositions that are all about the same thing. (Modulo homonyms that is, but please let me ignore those.) So we might say, as a way to eliminate referents from the account, that an object is just what a particular collection of related propositions is about. Think of the object as corresponding to a spindle, and the propositions about it are all impaled on the same spindle. The purpose of objects might be to organize propositions.
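
A throwaway sketch of the spindle picture: the ‘object’ is nothing but the key that a family of propositions is filed under.

    # An object as a spindle: nothing but the key under which a family
    # of propositions is filed.
    spindles = {}

    def impale(subject, proposition):
        """File a proposition on its subject's spindle."""
        spindles.setdefault(subject, []).append(proposition)

    impale("Fido", "weighs ten pounds")
    impale("Fido", "has a tail")

    # 'Fido' has no referent here; it is only the spindle that organizes
    # the two propositions.
    assert len(spindles["Fido"]) == 2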

Inferentialism

I like inferentialism, and was interested to hear the inferentialist take on reference. Jeremy Wanderer’s book on Brandom talks about ‘the challenge of subsentential structure’ – that nails the problem. But it then goes on to repeat Quine’s idea (in Use and its place in meaning) that substitution of one phrase for another, or coreference, is the best one can do by way of explanation. I find this very unsatisfying. If in a given language there were only one phrase that could refer to X, then we would have no account at all of the meaning of that phrase, which is absurd.


Cite, citation

  1. The officer cited Bob for speeding.
  2. The officer issued Bob a citation.
  3. Alice cited Bob in her argument.
  4. Alice cited Bob’s paper in her argument.
  5. Alice’s paper had a citation to Bob’s paper.

Could these usages all be instances of a common pattern? I claim that they are. I cite the OED as my first witness. Here is what it gives as the first definition of cite:

To summon officially to appear in a court of law, whether as principal or witness.

The etymology given is Latin citāre to move, excite, or summon.

In example 1, Bob is the principal in a case. It’s not the officer’s job to declare guilt or mete out punishment, so he’s telling Bob to show up in court for trial and judgement on the accusation of speeding. Of course we usually default on these citations, thus implicitly pleading guilty, and pay a fine without showing up in court.

In example 2, there is a citation, that is, an act of citing: The officer is citing Bob (demanding that he show up in court). This would ordinarily mean that there is a piece of paper that records the fact that the officer has cited Bob, and that piece of paper is called a citation, and the piece of paper issues from the officer. This follows a common pattern where a proposition (claim, accusation, agreement, etc.) is confused with an expression of it (piece of writing, audio recording, etc.). I suspect there is a word for this kind of metaphorical extension, something akin to metonymy, and if there isn’t there should be.

In example 3, we’re not necessarily talking about a court of law, but a metaphorical court, that of scholarly debate. Alice is not literally asking Bob to show up as witness, but is doing so metaphorically, and if the stakes were high – if the argument escalated to a court case – that is what she would do.

In example 4, Bob has written a paper making some claim, and that claim, Alice says, supports her argument – so she wants him as a witness. Papers are much easier to summon than people, especially when e.g. the person is dead or incapacitated, so Alice cites the paper as a substitute for Bob himself. The paper is analogous to a legal deposition. In the era of the Web, we access sources very easily, by following a hyperlink, so a hyperlink can very usefully serve to satisfy a citation (with the usual provisos about bit rot, digital attacks, and so on).

In scholarly writing, as elsewhere, citation is an act of citing, or (metaphorically) a physical record of such an act. Those little parenthetical or superscript numbers you see in academic writing act as citations only if you know who (or what) is being summoned and why. The why usually comes from the preceding sentence – i.e. the sentence makes a claim and the superscript acts to cite (summon) support for the claim. The what or who comes from the footnote or endnote.

To say, as Wikipedia does today, that the little superscript is the definition of citation is ridiculous. It is just a participant in an act of citation. Even saying that a citation is a kind of reference is I think quite wrong. The expression of a citation makes a reference – to the entity being summoned. But there are many references that are not for the purpose of citation, and a citation is not a kind of reference. A reference can imply a citation just as shouting the name “Bob” can imply a request for Bob’s presence, but the reference to Bob and the request that he come are two completely different things.

As usual I’m going to say my peevish thing about the dilution of language robbing us of useful and deep means of expression, and shifting focus to the silliness of mechanics. Making claims and defending them are the heart of scholarship, and citation is a close partner. Written articles and their superscripts and lists are just mechanics and are in a sense irrelevant – if there were another way to accomplish the claiming and defending, that would be fine, and would not require us to stop using the word “citation”.

Thanks to Ross Mounce and Ed Summers for forcing me to write this down.


Question-answering robot

2014-12-15

Continuing my exploration of meaning and reference. Some of the following repeats what I’ve already written in this blog, sorry. I’m using the blog as a way to work out the exposition, which is still too long and too abstract.

TL;DR You can’t independently specify or test the meanings of sentence parts. Specifying and testing meaning always reduces to specifying and testing the meanings of complete sentences.

Propositions again

When an agent A sends a message to an agent B, one or both of the following should hold:

  • there is a correlation between what holds (what the world is like) before generating the message and A’s choice of message to generate,
  • there is a correlation between the message that B receives and what holds after interpreting the message.

Without at least one of these, nothing has been communicated.

By “correlate” I mean that the message and what holds vary together. E.g. as the color of a disk varies between green, red, and blue, the message might vary between “the disk is green”, “the disk is red”, and “the disk is blue”.

I’ll call what holds before generating a precondition (of message generation), and what holds after interpreting a postcondition (of message interpretation). When I use these words I really mean least inclusive precondition and most inclusive postcondition, since otherwise the terms are not helpful.

The precondition case covers messages that are simple declarative sentences (“it is raining”). I’ll call such messages “p-messages”. (Being a p-message is not an inherent property of a message. To classify a message as a p-message you have to know something of the sender’s and receiver’s behavior.)

We can experimentally test any candidate proposition for whether it is the precondition of some message being sent. Just vary the agent’s circumstances (i.e. state) and watch what messages the agent sends. If the given message is sent if and only if the proposition holds, then the proposition is the precondition of sending that message.
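
Here is roughly what such a test looks like, as a sketch (the agent and state space are toy stand-ins of my own devising):

    def is_precondition(p, message, agent, states):
        """True iff the agent sends the message exactly when p holds."""
        return all((message in agent(s)) == p(s) for s in states)

    # Toy agent that reports the disk's color.
    states = ["red", "green", "blue"]
    agent = lambda s: {"the disk is " + s}

    assert is_precondition(lambda s: s == "red", "the disk is red",
                           agent, states)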

Imperative sentences (“please close the window”) can be treated in a dual manner; one might call them s-messages and say that the postcondition of interpretation is a specification. Again, the claim that a particular proposition is the postcondition can be tested.

You might ask: Well, what if the precondition of the generated message isn’t met (the system “lies”), or the postcondition of interpretation isn’t met (it “betrays” us)? How can you call something a postcondition of interpretation, when success is not guaranteed? You could say a specification isn’t met, or that a theory is wrong. But in any engineered system, success is never guaranteed. Things go wrong. Perhaps some constituent part does not live up to its specification, or the system is operating outside of its specified operating zone. You could put the qualifier “unless something goes wrong” in front of everything we say about the system, but that would not be very helpful.

Questions and answers

It’s less clear what to say about interrogative sentences (“what color is the disk?”) and responses to them (“green”).

For the sake of neutrality, and to emphasize their syntactic nature, I’ll call interrogative sentences “q-messages” and their responses “a-messages”, q and a being mnemonic for “question” and “answer” respectively.

Consider a scenario in which a q-message is sent to a question-answering robot, and an a-message is sent in response. To apply the pre- and post-condition framework given above, we need to consider the postcondition of interpreting the q-message, and the precondition of generating the a-message. (I’ll only consider what it takes to specify or describe the question-answering robot, not the agent that communicates with it, to which I grant total freedom.)

What needs to be the case after the q-message is received? Well, an a-message must be sent; but not just any a-message. As with p-messages, for any a-message, a certain precondition must be met. But crucially, the precondition, and therefore the choice of a-message, depends on what the q-message is. The question is: what is the precondition of sending a-message ax, given that the preceding q-message was qx? (‘x’ is for ‘syntax’)

If we’re trying to specify the behavior of the robot, we need to specify, for each q-message, what the allowable a-messages are, as a function of the agent’s current state. The robot can choose among these a-messages.

One way to do this is by brute force enumeration. For each qx, write down the function (perhaps nondeterministic) from circumstances to answers ax. The size of the specification is going to be proportional to m*n where m is the number of possible q-messages qx and n is the number of possible a-messages ax.

A better way is to exploit the structure of the robot’s world. When we ask what the color of the disk is, and what the color of the square is, we’re asking similar questions. Each color-inquiring q-message can be associated with a ‘spot’ in the world that can have its color sensed. When the q-message is received, the color state of the spot that it designates can be sensed and an appropriate a-message can be chosen.
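
As a sketch of exploiting that structure, borrowing the message names DZ, QS, RD, GE, UL from ‘Specifying meaning, part 2’ below (the whole setup is hypothetical):

    world = {"disk": "green", "square": "red"}   # the robot's circumstances

    SPOT = {"DZ": "disk", "QS": "square"}        # q-message -> designated spot
    A_MESSAGE = {"red": "RD", "green": "GE", "blue": "UL"}  # color -> a-message

    def answer(q_message):
        """Sense the color of the designated spot; choose the a-message."""
        color = world[SPOT[q_message]]
        return A_MESSAGE[color]

    assert answer("DZ") == "GE" and answer("QS") == "RD"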

Interpretation

It is natural to interpret q-messages as questions, and a-messages as answers, just as p-messages can be interpreted as propositions. This may be difficult or impossible for a particularly perverse robot, but if we are designing one ourselves, our ability to interpret messages is something we can control.

The proposition corresponding to a p-message can be inferred by studying the conditions under which the p-message is sent. Things are trickier regarding interpretation of q-messages and a-messages. For a q-message, we can look at how the resulting a-message varies with aspects of the world. If we can find a variable in the world that varies along with the a-message (correlates with it), and doesn’t vary otherwise [except within spans in which a single a-message covers many values – think about this], then we can say that the question is the one that asks what the value of that variable is.

Similarly, we can interpret an a-message as an answer: it is the answer that says that the variable that the preceding question (whatever it is) asks about takes on a value that can elicit the a-message, given that the preceding q-message is interpreted to be that question.

Formalism

There is a tidy way to look at questions and answers using a simple formal veneer.

Any proposition p induces a function pf from world states to {true, false}, defined so that pf yields true when p holds in that world state, and false otherwise. (To spice things up I sometimes say “what holds” or “circumstances” instead of “world state.”) Call such a function a “p-function”.

Similarly, a question q induces a “q-function” qf from world states to values, and an answer a induces an “a-function” af from values to {true, false}. qf determines the value corresponding to a world state, and af tells whether an answer a is acceptable for a given value.

Consider the proposition that q has a as an answer. Call this proposition z. Let qf be the function induced by q, af be the function induced by a, and zf be the function induced by z. Then the following holds:

  zf = af o qf

i.e.

  zf(ws) = af(qf(ws))

Interpreting this, it says a question/answer pair is (like) a factorization of a proposition.
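
A tiny sketch of the factorization, using the disk world again (everything here is a toy):

    world_states = ["red", "green", "blue"]   # world state = the disk's color

    qf = lambda ws: ws                  # question: 'what color is the disk?'
    af = lambda value: value == "green" # answer: 'green'
    zf = lambda ws: af(qf(ws))          # proposition: 'q has green as an answer'

    # zf is a p-function: true in exactly the states where the answer is right.
    assert [zf(ws) for ws in world_states] == [False, True, False]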

Any formalism is likely to drop some of the richness of what it models. Real propositions, questions, and answers probably have more structure to them than functions do. Whether enough structure is captured depends on how the formalism is applied. In this context we’re concerned with specification and prediction, and functions may work fine.


Specifying and testing

It makes sense to specify that a p-message MUST “mean” a particular proposition p – you are just saying that the robot must generate the p-message if and only if p. We can test to see whether the robot satisfies this condition.

Suppose we tried to specify that a q-message MUST “mean” a particular question q. A specification must be testable. How would a claim that qx means q (when the robot interprets qx) be tested? We’d have to see what a-messages were generated in response to qx – they would have to be the ones that “mean” correct answers to q. But to say this, we need to specify that a set of a-messages MUST “mean” a corresponding set of answers. Then, to test whether an a-message “means” a particular answer a, you’d have to send a bunch of q-messages, and for each one, check whether the a-message that comes back is or is not generated, depending on whether the answer a is an answer to the question that q-message “means”. But then you’d have to specify what each q-message “means”. This is circular.

This is therefore not the way to specify the behavior of a question-answering robot. What you have to do is to define a correspondence between q-messages and questions, and a second correspondence between a-messages and answers. Because we’re writing the specification we can simply do so by fiat, by way of exposition, just as in a specification for motor oil you might say ‘define v = 0.0114’ and then use ‘v’ elsewhere in the specification. Simply defining correspondences does not by itself say anything about what the robot has to do. Then, we specify that when a q-message is received, the a-message generated MUST be one with the property that the corresponding answer is an answer to the question corresponding to the q-message that was received.

An alternative, fully equivalent approach would be to specify the behavior of the robot using the formalism. You could define a correspondence between q-messages and q-functions, and between a-messages and a-functions, and say that the generated a-message MUST be one that makes the composition of the q-function and the a-function evaluate to true when applied to the world state. These correspondences give an interpretation of the q- and a-messages that is just as effective as the interpretation where they are questions and answers.
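
A sketch of this second approach, with hypothetical correspondences (DZ, QS and RD, GE, UL as before):

    QF = {"DZ": lambda ws: ws["disk"],        # q-message -> q-function
          "QS": lambda ws: ws["square"]}
    AF = {"RD": lambda v: v == "red",         # a-message -> a-function
          "GE": lambda v: v == "green",
          "UL": lambda v: v == "blue"}

    def conforms(robot, q_message, world_state):
        """The generated a-message MUST make af(qf(world_state)) true."""
        a_message = robot(q_message, world_state)
        return AF[a_message](QF[q_message](world_state))

    # A robot that senses correctly passes:
    robot = lambda q, ws: {"red": "RD", "green": "GE", "blue": "UL"}[QF[q](ws)]
    assert conforms(robot, "DZ", {"disk": "blue", "square": "red"})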

Going in the other direction, when we reverse engineer a question-answering robot, we have to come up with a theory that explains the data. The data consists of q-message/a-message pairs. As we develop our theory, the correspondences of q-messages and a-messages to meaning-like entities (question/answers or q-functions/a-functions) have to be hypothesized and tested in tandem; we cannot understand q-messages in isolation, or a-messages in isolation.

Compositional languages

Given an understanding of question answering, it is very easy to imagine, or design, a language of p-messages that have two parts, one part being a q-message and the other an a-message. (Perhaps some punctuation or other trivial change sneaks in there, but that’s not to the point.) The meaning of the p-message – i.e. the precondition (proposition) that holds when it’s generated – is that the a-message is a correct response to the q-message. The analysis works exactly as it does for question answering.

This particular compositional message formation is an instance of the principle of compositionality, which holds when the meaning of a compound phrase (such as a p-message) is nontrivially determined by the meanings of its parts (in this case a q-message and a-message). I say “nontrivially” because in any language where phrases have parts you can always come up with some trivial definition of part meaning and composition – essentially a table lookup – that makes the phrase meaning the same as the composed meaning. Compositionality means that there is some compression going on, and you’re not just quadratically listing all the cases.

Example: q-message “Which way is Lassie running?” + a-message “South.” => p-message “Lassie is running south.”

See also

  • Horwich, Truth, Meaning, Reality, of course…
  • Yablo, Aboutness, of course…
  • Carpenter, Type-Logical Semantics
  • Wittgenstein, Tractatus Logico-Philosophicus
  • Jeffrey King, The Nature and Structure of Content
  • Gopnik and Meltzoff, Words, Thoughts, and Theories

Afterthought: When reverse engineering there are always multiple theories (or should I say ‘models’, like the logicians) that are consistent with the data, even when you account for isomorphisms. This is certainly true when the world state space is incompletely sampled, as it would be if it were continuous. But I think this holds even when everything is known about the robot’s world and behavior. It is customary, if you have your hands on multiple theories, to choose the simplest one in making predictions (Occam’s razor). (At this point I want to point you at the work of Noah Goodman…)

Afterthought: There’s a book that argues that young children are scientists developing and testing theories of the world. When I remember the name I’ll add it to the list.


Specifying meaning, part 2

Follow-on to Specifying meaning

Specifications are often written as if they communicate social norms: the diameter of the nail’s shaft must be 4 mm plus or minus .01 mm. This is not to say the nail has taken on a moral obligation; rather, it is shorthand for saying that before you claim that X meets specification Y, you must make sure that the ‘must’ conditions of the specification are met by X.

Last time I talked about specifications for communicating agents. One is tempted to say that the meaning of a message is a property of the message, which is part of the agent (before it is sent at least), in the same way that the diameter of a nail is a property of the shaft, which is part of the nail. So if it is possible to specify that (a) the nail’s shaft diameter must be 4mm, then it should be possible to specify that (b) the agent’s message’s meaning must be that the battery is charged.

There is something funny about putting “meaning” in a position like this. One can measure a nail’s shaft diameter, and as there is general agreement on how to take such measurements, it is unlikely that two parties (say, a supplier and a buyer) will disagree. The specification (a) is objective and actionable. But what about (b)? Is there a way to make “the meaning of M is P” objective and actionable the way “the diameter of the shaft is 4 mm” is? (Remember that P is a proposition, such as the proposition that the battery is charged.) Probably not in the sense of “what does agent A mean by message M”. Here “mean” has the sense of “intend”, and intention is subjective, i.e. unfalsifiable. The agent could mean X and do Y. What it means (intends) doesn’t matter in an engineering context; only observable behavior such as action or speech matters.

But “the meaning of M is P” could be objective if interpreted as

  1. the meaning of A sending M is P, or
  2. A’s sending M correlates with P, or
  3. P determines whether A sends M, or
  4. A sends M if and only if P.

which are all mostly equivalent. (Correlation is not causation, but it’s hard to tell them apart, and the difference may not matter if your goal is engineering.) Instead of M being the bearer of meaning, the act of sending M is the bearer.

Reporting is now a clear, testable property of the agent in question. You can check for 4. as follows: If A transmits M when P, and does not transmit M when not P, then the meaning of [A sending] M is P.

(This test has to be modified if there is competition between M and other messages for use of the communication channel; failure to send M might just mean that the channel or the agent is preoccupied with other things. But I hope you get the idea.)

If there are too many states to test exhaustively, sample the state space. If this is a seller/buyer situation, the buyer ought to be very skeptical, performing a wide variety of clever tests to ensure that A meets the specification (that it sends M if and only if P). (Perhaps the seller could provide a warranty, or a mathematical proof, to reassure the buyer.)
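
A sketch of the skeptical buyer’s procedure – random sampling of the state space (the battery agent is a toy of my own; a passed test is evidence, not proof):

    import random

    def sample_test(agent, message, p, state_space, trials=10000):
        """Spot-check 'A sends M if and only if P' on randomly drawn states."""
        for _ in range(trials):
            s = random.choice(state_space)
            if (message in agent(s)) != p(s):
                return False        # found a state that violates the spec
        return True

    # Toy battery-reporting agent: state = (battery, charge level).
    states = [(b, x) for b in ("charged", "flat") for x in range(100)]
    agent = lambda s: {"the battery is charged"} if s[0] == "charged" else set()
    assert sample_test(agent, "the battery is charged",
                       lambda s: s[0] == "charged", states)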

Any of the interpretations 2, 3, 4 eliminates the word “meaning” without sacrificing what was originally expressed using that word. All suggestion of intent and propositional attitudes (belief, intent, truth, fidelity) has gone away. We are left with a clearer, more actionable specification.

Compositional semantics

Of course this is the simplest kind of communication: a single message whose meaning is independently specified and testable. More general scenarios have parts: either there are multiple coordinated messages, or there is a single message that has parts. The messages or parts combine somehow to form behavior that can be tested. For example, the correctness of an answer depends on what the question was, and the correctness of the predicate (in a sentence) depends on what the subject is. Parts that can’t necessarily be tested in isolation can combine to form wholes that can be.

Consider the problem of specifying a question-answering robot. Suppose the robot ‘understands’ (accepts, responds to) two ‘question’ messages, DZ and QS, and it can answer each question in any of three ways, RD, GE, or UL, with the correct choice depending on its circumstances. Suppose that the robot has a camera, and in its visual field are a disk and a square; that the disk and square can independently be either red, green, or blue; and that the robot can sense the color of the disk and square.

We want to specify the following:

question   answer   if and only if …
DZ         RD       the disk is red
DZ         GE       the disk is green
DZ         UL       the disk is blue
QS         RD       the square is red
QS         GE       the square is green
QS         UL       the square is blue

That is, the answer in the answer column is the correct (specified) one, if the question is the one in the question column and the condition in the third column holds.

The claims (that the disk is red etc.) and therefore the correctness of an answer can be verified through testing or through other objective, skeptical means, as above. Color is notoriously difficult to pin down, and if “is red” is too vague for one of the negotiating parties, a more precise condition can be negotiated, such as a function of the readout of a sensor operating under controlled conditions; and the better condition can be recorded as part of the specification.

We can enumerate all possible question/answer pairs like this, giving the necessary and sufficient condition for each pair, although it gets tedious if there are more than a few questions and answers. Looking at this list it is much more appealing to take the questions and answers as having meaning in themselves. It appears that DZ asks what the color of the disk is, QS asks what the color of the square is, and RD, GE, UL express that the color (of whatever is in question) is the corresponding color.

Folk semantics says that the questions and answers have meanings, and that the sender and receiver should know these meanings. One might say that the meaning to the sender is (one hopes) the same as the meaning to the receiver. But statements like these are impossible to assay and of little use in a specification. How do you test to see whether a robot gives any particular meaning to some question?

I’ve suggested a way to specify the meaning of a complete message – give the preconditions for generation, or postconditions for interpretation. Maybe we can reduce the question of the meaning of message parts, or dialog parts (such as a question/answer pair), to the case that’s already been solved.

Remember that a specification is part of a negotiation between two parties, say a seller and a buyer. The seller and buyer are the ones that have agreed, or hope to agree, on whether the question-answering artifact (robot) meets the specification. This boils down to agreeing on how to evaluate the robot for conformance. They need to have a shared method to go from a question and an answer, to a test that helps determine whether or not the answer is the correct answer to the question. (As before this is a simplification, since we have guarantees and so on, but the idea of objectivity remains no matter how complicated things get.)

One way is for the two parties to agree between them on the meanings of questions and answers, and then to agree that a question/answer scenario is tested in given circumstances by combining the question with the answer in some agreed manner, yielding a test (or a testable proposition).

That is, they agree that DZ “means” to ask what is the color of the disk (and so on), RD “means” that the color under consideration is red (and so on), and the method of the combination of Q with A is the obvious one: the proposition to test is the proposition that the answer that A “means” is true, given that what Q “means” (e.g. what the color of the disk is) is something the answer applies to.

I use “means” here but this is only to play with your mind; it could be any placeholder relationship, since its only purpose is to be introduced (by the spec) and then discharged (in an application of the combination rule in the spec). We could just as well have said DZ “is associated with” what the color of the disk is, and RD “goes along with” that the color is red. There is no objectivity to these relationships; they are just local definitions within the specification, obtaining their truth and legitimacy only by private agreement among users of the specification, and meaningful only in the context of the specification. And importantly they only address evaluations of the behavior of the robot, not the robot’s implementation. There is no need to appeal to “representations” or “intent”.

question   is associated with
DZ         what color the disk is
QS         what color the square is

answer     goes along with
RD         the color is red
GE         the color is green
UL         the color is blue

Now we have two tables with a total of five entries, instead of one table with six entries, that is, M+N things to explain in the spec rather than M*N. Not only does this yield economies in implementation and verification but it means that question/answer pairs that have never been seen before in design or training can have agreed semantics – which I take to be the heart of compositional semantics.
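
In code the compression is plain to see: two small tables and one combination rule generate all six (or m×n) conditions, including pairs never considered during design. A sketch:

    Q_ASSOC = {"DZ": "disk", "QS": "square"}               # 2 entries
    A_ASSOC = {"RD": "red", "GE": "green", "UL": "blue"}   # 3 entries

    def condition(q, a):
        """The combination rule: the spot associated with q has the color
        that goes along with a."""
        return "the %s is %s" % (Q_ASSOC[q], A_ASSOC[a])

    # All six rows of the one-table spec are recovered from five entries:
    assert condition("DZ", "RD") == "the disk is red"
    assert condition("QS", "UL") == "the square is blue"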

This “association” approach to meaning may sound completely vacuous, and it is meant to sound that way. But going this route is very different from the folk semantics I gave above. In the folk semantics, meaning resides in the robot or in what it says. But in the preceding treatment, meaning (of questions and answers, or of message-parts) is only part of the dialog between the users of the specification (seller and buyer, or whatever). Meaning is taken up, in a controlled way, in the specification, not in the robot.

Notes

Here are three wonderful references for all you fellow meaning skeptics:

The idea of equivalence of question/answer pairs and declarative sentences is from Yablo (Aboutness, section 2.2).

Horwich (Truth, Meaning, Reality) inspired me to look for a deflationist treatment of compositional semantics.

I cheer when I read Reddy’s article “The Conduit Metaphor”, which argues that the widespread metaphor of messages as carriers of meaning is not helpful.

To be written: what all this has to do with vocabulary specifications, IAO, linked data, and httpRange-14.


Specifying meaning

Specifications

A specification articulates a property or constraint that an artifact may or may not satisfy. That is, you have some class of artifacts, like screws, to which a given specification might apply, and you can ask whether or not any particular artifact meets the specification. A specification could cover physical objects (like screws) or substances (motor oil), pieces of software, or even a process carried out by people (as in ISO 9000).

A specification may be used as a guide in choosing or constructing an artifact, can be used in an offer to sell something, and so on.

The key idea is spec as class or category: a given artifact meets the spec or doesn’t. There may be effective tests for whether something meets a specification. If a spec says that a steel rod has to last for ten years, it is neither practical nor sensible to wait ten years to see if a given rod meets that spec. But perhaps the test has been performed on identical rods, or maybe there are proxy tests that we can perform, such as accelerated aging, that will give a good indication as to whether the spec will be met by the instance in hand.

In engineering the purpose of a spec is to allow us to combine parts to yield systems that have predictable and useful properties. To put it somewhat tautologically, if X meets spec A, and Y has the property that combining it with something meeting spec A yields a complex having property P, then we can predict that combining Y with X will yield something with property P.

Standards bodies like IETF and W3C exist to cause specifications to be created. They facilitate social interactions in which people, often representing competing companies, can come to agreement to give a name (such as ‘SAE 20’) to a particular specification. This allows anyone to say things like “the thing I manufacture meets SAE 20” or “I would like to buy something that meets SAE 20”. This shorthand reduces transaction costs (time taken negotiating specifications) and creates markets (by enabling choices among providers).

Communication

W3C and IETF are specifically involved in developing specifications that apply to communication between computers. Communication involves two or more agents playing the roles of sender and receiver, connected by a communication channel that carries messages. So any specification that has to do with communication necessarily constrains some part of a sender-channel-receiver complex: the sender, or the channel, or the receiver, or an agent/channel complex.

A syntactic constraint is a constraint on what messages are or aren’t generated by a sender or interpreted by a receiver. A pragmatic constraint is one that relates what is generated or interpreted to the circumstances (state) of the sender or receiver. (The word “pragmatic” is used in all sorts of ways by various writers. This is how I’m using it.)

For example, a specification document such as that for SVG implicitly bundles two specifications, one for senders and one for receivers. An SVG sender is one that obeys syntactic constraints in what it sends. An SVG receiver is one that interprets SVG documents in a manner consistent with the specification. A receiver that draws a square when a circle is called for would not meet the specification. Since this constraint relates messages to behavior (‘circumstances’), it’s a pragmatic constraint.

Pragmatic constraints on receivers (interpretation) are common in document type specifications, such as those for image formats or programming languages. But specifications involving pragmatic constraints on senders also exist, especially in protocol specifications where a sender may be responding to a request. A weak example of a sender constraint is that an HTTP GET request must not be answered with a 411 Length Required response (since nothing requires a GET request to specify a content-length). A better example is the SNMP protocol. A device that is sent a request using the SNMP protocol (such as ‘when did your operational status last change’), and gives a syntactically correct response containing blatantly untrue information (‘two hours ago’ when actually it was five minutes ago), would not be said to be compliant with the SNMP specification.

Where constraints are not given, designers will exploit the fact to do interesting things. That SVG doesn’t tell senders which syntactically correct messages they can or should generate is the whole point of SVG: you’re allowed, and expected, to use it to express whatever graphics you want to.

In sum, we can specify:

  • syntactic constraints on senders (what’s not generated)
  • pragmatic constraints on senders (preconditions of generation)
  • pragmatic constraints on receivers (postconditions of interpretation)

Languages and meaning

Descriptions of constraints on senders and receivers are usually bundled; a single document describes them together. Comparing a given sender or receiver against constraints means paying attention to certain parts of the description, and ignoring others. This is particularly natural in the case where two agents carry on a dialog; if you are engineering a sender it’s useful to know how a matching receiver might respond to messages. A constraint bundle that applies to both senders and receivers might be called a ‘language’ or ‘protocol’.

One can speak either of the constraints met by a particular sender or receiver, or of constraints prescribed by some documentation; it’s constraints either way, retrospective in one case and prospective in the other.

Communication constraints are message dependent. That is, the constraint on an agent is that its circumstances and state should be a certain way as a function of the message on the channel: for every message M, if M is on the channel, then some constraint C_M depending on M should apply to the agent. If the agent is a sender, the constraint is on states or events leading up to the message being on the channel – what the world has done to the agent. If it’s a receiver, the constraint is on states or events following the message being on the channel – what the agent will do to the world.

(The constraint on a sender could also be on a future state of the sender, in which case the message is a promise.)

The pair of constraints (S_M, R_M) on sender and receiver agents, specific to a message, is a good candidate for the ‘meaning’ of a message relative to a language. It appears, then, that it is possible to specify, or reverse engineer, the meaning of a message when it appears on a particular channel.

When the sender is pragmatically unconstrained (it can send what it likes) and the receiver is constrained (has to do what it’s told), a message or language is ‘imperative’. When the sender is pragmatically constrained (must give information about its circumstances) and the receiver is not (can do as it likes with that information), a message or language is ‘declarative’.

‘Knowledge representation’

The official W3C specifications for RDF are very weak and impose no pragmatic constraints. An agent that sends RDF messages (‘graphs’) is not out of spec for sending any syntactically correct RDF under any circumstances; nor is an agent that consumes RDF constrained in what it does having received it.

(There are so-called semantic constraints for RDF and OWL: every graph is supposed to be “consistent.” But this is effectively a glorified syntactic constraint, since it can be decided without touching pragmatics.)

There are some pragmatic constraints around SPARQL, but these only dictate that a SPARQL server should act properly as a store of RDF graphs – if something is stored, it should be retrievable.

The interesting thing that RDF (and OWL) do is to suggest that RDF users may create secondary specifications to apply to senders of RDF. Such constraint sets are called “vocabularies” or “ontologies”. An agent conforming to a vocabulary will (by definition) not generate RDF messages at variance with that vocabulary. If we take RDF and further constrain agents by a set of vocabularies, what we get is a declarative language, something much more like SNMP than it is like SVG.

For example, the specification for the Dublin Core vocabulary effectively says that if [dc:title “Iron”] is in the message/graph, but the resource to which this applies does not have “Iron” as its title, then the sender of the message is not in conformance with the vocabulary. (I’m taking a notational liberty here for the benefit of people who don’t know RDF, and I beg you not to ask what “the resource” is, since the answer is difficult and irrelevant to the example.)
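
The conformance condition can be sketched without any real RDF machinery (the ‘graph’ and ground truth below are toy stand-ins, not rdflib or any actual API):

    actual_titles = {"doc1": "Iron"}          # ground truth about the resource
    graph = [("doc1", "dc:title", "Iron")]    # what the sender asserted

    def sender_conforms(graph, actual_titles):
        """A sender conforms to the vocabulary only if its dc:title
        assertions match the resources' actual titles."""
        return all(actual_titles.get(s) == o
                   for (s, p, o) in graph if p == "dc:title")

    assert sender_conforms(graph, actual_titles)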

Unlike SNMP, whose pragmatic constraints can be satisfied by a mechanism, it is usually impossible to use an RDF vocabulary correctly without human intervention. A vocabulary-conforming RDF sender almost always obtains information from a human source, either from form input, text mining, or manual curation. In these cases an automaton is only conforming if the human input is correct, so it is the human + automaton complex that should be judged against the specification. By the same token, interpretation, while unconstrained by the vocabulary specification, in most use cases requires human intervention for any useful pragmatic effect. Thus most involvement of computers with RDF is for those applications not requiring generation or interpretation according to vocabularies: storage, search, translation between formats, and inference.

(I say “usually” impossible because it is certainly possible to use RDF in a manner similar to SNMP, where automaton-generated graphs are vocabulary-conforming without human input. But this is not how RDF is ordinarily used in practice.)

So there are three funny things about the “semantic web” languages that make them misunderstood outliers in the world of W3C/IETF specifications.

  1. Unlike nearly every other artificial language (XML and JSON excepted), they have no meaning – no pragmatics are defined by the core specifications. All pragmatics comes from layered specifications.
  2. As practiced, i.e. subject to vocabulary specifications, they are declarative, not imperative; pragmatic constraints from vocabularies are on senders (similarly to SNMP servers), not receivers (as in SVG, C, …).
  3. Meeting the pragmatic constraints of vocabularies typically requires human involvement, meaning that vocabulary specifications are meaningfully applied not to automata but to automata/human complexes.

Reference

I wanted to write about the question of whether reference could be specified, but needed the above by way of introduction. More later perhaps.

Oh, maybe you wanted to know what sources I would want to refer you to as background to this piece of writing. Philosophical Investigations is apropos. And I acknowledge the influence of Larry Masinter and Gerald Jay Sussman, but have nothing of theirs specifically to refer you to.


Yablo Aboutness

“Aboutness” – that is, the question of whether, for given X and Y, X is about Y – is interesting in its own right, and is of interest technically, for example in understanding the foundations of web architecture and the semantic web, and of the engineering of tools such as the information artifact ontology. Stephen Yablo’s book on the subject is a delight to read. He takes quirky examples from his personal life and from literature, and he avoids unnecessary jargon. And it provides plenty of useful insight into the question.

Here is how I understand his model:

The world changes, i.e. there are many different conditions or states it might be in. Borrowing language from dynamical system theory we consider a world state space, whose points are all the potential states of the world. As time advances, the actual world traces out some path through this space.

The notions of subject matter, aboutness, and parthood can be modeled using a lattice of partitions of the world state space. Consider some object X. Ignoring all of the world other than X, X has its own states, in its own state space. X is part of the world, though, so its states are determined by the states of the world – a sort of simplification or projection. Recall that a partition of a set S is a set of nonempty sets (called ‘blocks’) such that (a) distinct blocks are disjoint and (b) the union of all the blocks is S. We can take X’s state space to be a partition of the world state space, and its states to be blocks, as follows: Two world states are in the same X-block (they are X-equivalent) iff they differ only in ways that make no difference as far as X is concerned. When X changes, the world moves from one X-block to another, and when X doesn’t change, the world stays in its current X-block.

To help grasp the formalism I like to think of the simple case where the world state space is R^3. The world state traces out a path in R^3. We may sometimes care only about one of the coordinates, say y but not x or z. y is an ‘object’ with a state space isomorphic to R, but we model it as the partition of R^3 with one block of world states (points in R^3) for each possible state of y. That is, each y-block is a plane parallel to the xz-plane.

The partitions of the world state space form a lattice, so we can speak of the ordering of partitions (called finer-than or coarser-than depending on which direction it’s written), and of meets and joins and all the usual lattice-theoretic stuff. For every entity there is a partition, and if X is part of Y, then X’s partition is coarser than Y’s partition. (Intuitively: smaller things have smaller state spaces / bigger blocks.) So coarser-than models parthood. Coarser-than also models “inherence” of a quality in the thing that has that quality: that Fido’s weight “inheres” in Fido means that Fido’s weight’s partition is coarser than Fido’s partition. (I’m using ‘quality’ in the BFO sense, although I probably really mean ‘dependent continuant’.) Similarly, observe that any proposition (e.g. “Fido weighs 10 pounds”) partitions the world into two blocks: one consisting of states in which the proposition is true, and the other those in which it is false. When a proposition is “about” an entity, its partition is coarser than the entity’s.
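
A discrete sketch of the machinery, over a toy two-variable world (nothing here is from the book; the states and keys are my own):

    from itertools import product

    # World states: (Fido's color, Fido's weight in pounds).
    WORLD = list(product(["brown", "white"], [5, 10, 15]))

    def partition_by(key):
        """States are equivalent iff they agree on what 'key' extracts."""
        blocks = {}
        for ws in WORLD:
            blocks.setdefault(key(ws), set()).add(ws)
        return [frozenset(b) for b in blocks.values()]

    fido = partition_by(lambda ws: ws)                  # finest: 6 blocks
    weight = partition_by(lambda ws: ws[1])             # 3 blocks
    weighs_ten = partition_by(lambda ws: ws[1] == 10)   # a proposition: 2 blocks

    def coarser_than(p, q):
        """p is coarser than q iff every q-block fits inside some p-block."""
        return all(any(qb <= pb for pb in p) for qb in q)

    # Fido's weight inheres in Fido, and the proposition 'Fido weighs
    # ten pounds' is about Fido's weight:
    assert coarser_than(weight, fido)
    assert coarser_than(weighs_ten, weight)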

I find this uniform treatment of objects, parts, qualities, and propositions to be appealing. It helps explain my discomfort with conventional ontologies like SUO. Consider the following four entities:

  1. Fido
  2. Fido’s tail
  3. Fido’s weight
  4. That Fido weighs ten pounds

The SUO top level would say that 1 and 2 are Physical, that 4 is Abstract (because it’s a Proposition), and that 3 doesn’t exist. To me they are all the same kind of thing, just some more “part-like” than others. They ought to be either all Abstract or all Physical. By Yablo’s programme they are just entities with partitions of varying fineness.

Although I’m not always a fan of BFO, it is closer to a uniform treatment. 1, 2, 3 are all continuants. BFO has no propositions (4) but it is not difficult to imagine adding them, and it is pretty clear where they would fit (they would be a particularly “atomic” kind of dependent continuant).
