Some thoughts on propagators… seemed like an a-ha moment when I wrote it about a month ago, but now that I reread, it seems sort of trivial… oh well, what are blogs for.
The Sussman/Radul propagator framework is described operationally, as a mechanism for manipulating constraint networks and doing constraint propagation, together with dependency-directed backtracking for diagnosing inconsistencies. I want to redescribe it as a mechanism implementing operations that can be described declaratively: to wit, relational algebra.
Relational algebra was invented to provide a clean theory of databases. According to relational algebra, a database is a set of tables, which are physical and mutable. Each table carries, at any given point in time, a relation, which is ideal and mathematical. When a table is updated, it goes from carrying one relation to carrying another (usually very similar) relation.
Relations in the broader mathematical sense are uniform, potentially infinite sets of tuples. (“Uniform” meaning the tuples all have the same arity, making the relation rectangular, i.e. a subset of a cross product.) In a relational DBMS the relations held by tables can be given by enumerating their tuples. But there is nothing in the theory of relational algebra that requires that relations be enumerable. Relational algebra still makes logical sense with infinite relations, e.g. the less-than relation.
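The distinction can be sketched in a few lines of Python (names here are my own, just for illustration): a finite relation can be held extensionally as a set of tuples, while an infinite relation like less-than can only be given intensionally, as a predicate — yet both answer the membership question, which is all the logic of relational algebra needs.

```python
# A finite binary relation, given extensionally as a set of tuples.
parent = {("alice", "bob"), ("bob", "carol")}

def less_than(x, y):
    """The infinite less-than relation, given intensionally as a predicate."""
    return x < y

# Both support the membership question, even though only one is enumerable.
assert ("alice", "bob") in parent
assert less_than(2, 7)
```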
Now the deep piece of relational algebra is relational compositions or joins. Join is a binary operation on relations that yields a relation. The key fact about the join operator is that it’s associative. The most interesting and performance-critical part of any DBMS is the implementation of joins – specifically (a) in what order to perform joins when there’s more than one, (b) for each join, the details of performing that particular join (e.g. enumeration over the left table vs. over the right table). The heart of any DBMS is its query planner, which is the often very hairy compiler-like code that makes these strategic decisions.
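Associativity is easy to see with relational composition of binary relations (a special case of join on the shared column). A toy version in Python — the point being that the query planner is free to pick either grouping:

```python
def compose(r, s):
    """Relational composition: (a, c) is in compose(r, s) iff
    some b has (a, b) in r and (b, c) in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

r = {(1, 2), (2, 3)}
s = {(2, 4), (3, 5)}
t = {(4, 6), (5, 7)}

# Associativity: (r;s);t == r;(s;t), so the order of evaluation is a
# pure performance decision, not a semantic one.
assert compose(compose(r, s), t) == compose(r, compose(s, t))
```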
So you should see the analogy to propagators. At any given time, every propagator network, including individual propagators and cells as special cases, has an associated relation (e.g., the ternary is-the-sum-of relation). (A cell’s associated relation is unary i.e. a set.)
The propagator network as a data structure changes in two ways: the relation that is expressed (i.e. the set of constraints – values in cells and so on) can be updated, and knowledge about the solution of a particular set of constraints (the propagated implications of the expressed constraints) can change.
Connecting propagators to cells is a side-effecty way of constructing new relational join expressions and therefore expressing new relations. The purpose of the propagation loop is to converge toward the relational join expressed by the network.
The readout of the knowledge state of the cells gives an approximation of the true solution to the constraint set. That would be the cross product of all of the sets expressed by each cell’s knowledge state (e.g. an interval), which contains the true solution relation. In the special case of the most successful possible outcome of constraint satisfaction, the knowledge states of the cells give exact single values for each cell, and the cross product approximation is the true relation, which is just a single tuple. This is the sense in which a propagator network can evaluate functions.
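Here is a toy sketch of the idea in Python — not the Sussman/Radul code, just an illustration with made-up names, using intervals as the knowledge states and a single is-the-sum-of constraint. The propagation loop narrows the intervals until it converges on a single tuple, and an empty interval signals the contradiction case:

```python
class Cell:
    """A cell whose knowledge state is an interval [lo, hi]."""
    def __init__(self, lo=float("-inf"), hi=float("inf")):
        self.lo, self.hi = lo, hi

    def narrow(self, lo, hi):
        """Merge new knowledge; return True if anything changed."""
        new_lo, new_hi = max(self.lo, lo), min(self.hi, hi)
        if new_lo > new_hi:
            raise ValueError("contradiction: the relation became empty")
        changed = (new_lo, new_hi) != (self.lo, self.hi)
        self.lo, self.hi = new_lo, new_hi
        return changed

def sum_propagator(a, b, c):
    """Enforce a + b = c in all three directions; True if progress was made."""
    return any([                      # a list, so all three narrowings run
        c.narrow(a.lo + b.lo, a.hi + b.hi),
        a.narrow(c.lo - b.hi, c.hi - b.lo),
        b.narrow(c.lo - a.hi, c.hi - a.lo),
    ])

a, b, c = Cell(1, 1), Cell(), Cell(5, 5)
while sum_propagator(a, b, c):       # the propagation loop
    pass

# Converged: the cross product of the cells is the single tuple (1, 4, 5).
assert (b.lo, b.hi) == (4, 4)
```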
If a relation becomes known to be empty, we have a contradiction, and the TMS can help us figure out why.
So how does this comparison help?
- By showing connections to familiar frameworks (relations and relational algebra), a presentation like this might help in communicating the propagator idea to people (like me) who like declarative and mathematical formulations of things, and might convince them to join the effort to develop the ideas.
- By thinking about the relations involved, one might find it easier to reason about convergence conditions, case analysis, and other hard problems, and to transfer mathematical intuitions developed elsewhere (topology, real analysis, compiler construction, database theory…) to the development of better propagator mechanisms.
Today is (or would be) Preston C. Hammer‘s 100th birthday.
I found a great photo of him on the eBay web site a few days ago, but it seems to be gone now. So it goes.
Today would be a good day to rethink how many empty sets there are, to thoughtfully consider how we reuse everyday words in technical settings, and to remember that our technical work takes place in a broader cultural and societal setting.
Many of his writings are on the web; an incomplete list is linked above. Here’s one I liked: http://mumble.net/~jar/articles/hammer-mind-pollution.pdf
I’m amazed at how much good music there is on youtube – not something I would have expected, given that (a) youtube is supposed to be about videos (b) the copyright holders like to milk recordings for $ … I discovered a trove of Conlon Nancarrow (top search hit, they’re all fantastic) a while back, and today have been listening to some Stockhausen (got to the Helicopter String Quartet via suggestions from Death Metal 4:33, which I found thanks to Bill Tozier), and from there the Rite of Spring … I see Carmina Burana is there too, Messiaen, Penderecki, on and on… Here’s Yo-yo Ma playing the Faure Elegie (although I like the piano accompaniment better than orchestra – voila!) … Let’s see, as a test, here’s an obscure piece that, last I checked, had no commercially available recording: Hindemith’s Sonata for Cello Solo, which I used to try to play… Maybe I’ll become interested in listening to music again?
This post is a continuation of the previous one.
We tend to think of some messages as declarative and others as imperative. For example, a message from a sensor tells the recipient what is sensed, so it seems to be declarative, while a message to an effector tells the recipient what it is supposed to do, so it seems to be imperative.
In symmetric situations, like the contrived example (see previously), the same message might seem declarative when seen from the sender’s point of view, but imperative from the receiver’s.
When designing (or verifying) a system one specifies what is supposed to happen when messages are transmitted over a channel, that is, which messages are supposed to be (or may be) sent by the sender, and/or what the receiver is supposed to do on receiving messages. The desired correlation between system parts is decomposed into one correlation between sender state and messages sent, and a second correlation between messages received and receiver state. The specification of one or the other of these is often called an interface specification. An interface specification either says what the receiver of a message is supposed to “do”, as a function of the message and the receiver’s state, or what the sender is supposed to send, as a function of its state.
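The decomposition can be made concrete with a tiny sketch (all names invented for illustration): a sender spec is a function from sender state to message, a receiver spec a function from message and receiver state to new state, and the desired end-to-end correlation factors through the channel.

```python
from typing import Callable

# A sender spec: which message must be sent, as a function of sender state.
SenderSpec = Callable[[str], str]
# A receiver spec: what the receiver is supposed to "do" (its next state),
# as a function of the message and its current state.
ReceiverSpec = Callable[[str, str], str]

sender_spec: SenderSpec = lambda state: {"alarm": "PING"}.get(state, "OK")
receiver_spec: ReceiverSpec = lambda msg, state: "alerted" if msg == "PING" else state

# The desired system-level correlation (sender in "alarm" leads to an
# alerted receiver) factors into the two interface specifications.
assert receiver_spec(sender_spec("alarm"), "idle") == "alerted"
```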
This is all pedantic and maybe obvious, but thinking in this way has helped me understand a puzzle. When I was on the W3C TAG we had occasion to talk about both HTML and RDF (at different times). When RDF came up sometimes someone would speak about what “RDF processors” might do, as if such things were somehow analogous to “HTML processors”. This always struck me as very odd and somehow troublesome, but I could never articulate why. I think the reason the phrase “RDF processor” is jarring is clear now. HTML is an imperative language; its specification tells browsers (or processors) what they should do to correctly interpret an HTML message (document). The generator of the message is relatively unconstrained by the specification – it can send any message it likes, as far as HTML is concerned. RDF and its vocabularies are declarative; their specifications tell curators (or generators) what they should do to generate correct RDF. The consumer of the message is relatively unconstrained by the specification. So speaking of “RDF processors” in a discussion of web standards is as peculiar as speaking of “HTML generators”. Standards have almost nothing to say about what either kind of thing is supposed to do. Their correctness is not articulated by any standard, but rather is idiosyncratic to the particular system in which they’re embedded.
This leads to an odd but I think illuminating way of looking at RDF. What does it mean for something to conform to, say, the Dublin Core vocabulary? This is another puzzle that has been bothering me. RDF being declarative, the specification would apply to generators of DC-containing RDF documents, not to consumers. The Dublin Core spec says (in effect) that you shouldn’t write ‘X dc:creator Y.’ unless what ‘X’ means is the creator of what ‘Y’ means, and so on. Because this condition is not usually decidable by machine, the “should” (conformance condition) applies not to a piece of software, but to an overall curation process, a system with human parts and mechanical parts, that leads to the generation of such a statement.
- There may be further installments, but I make no promises.
This is a turn in an ongoing conversation with Alan Ruttenberg; exploratory.
A system consists of a collection of parts that interact with one another over some period of time. A jazz concert is a system in which musicians interact with instruments and an audience. A scientific study is a system in which technicians interact with laboratory apparatus to manipulate and perform measurements on subjects. A computer is a system in which electronic components interact via changes in electric potentials, and so on.
Because parts act on other parts, they are sometimes called “agents” (“agent” and “act” having the same etymological root).
Examples of interactions include two rods linked by a joint, so that they are held together at a point but can pivot; a door held closed against a frame by a magnet; a lever used to apply a brake. Interactions induce connections or correlations between parts: when A is connected to B, its state correlates with B’s state. A’s state is affected by what has happened in A’s past, so after an interaction with B, B’s future states are affected by A’s past as well.
Certain kinds of interactions are ordinarily called out as communication interactions or communication events. (Interactions that are not communication interactions, such as those mentioned above, are sometimes called “physical” although of course communication is a physical phenomenon as well.) In a communication interaction a message (the word “token” in semiotics is related) is chosen or created by part A, placed or transmitted on a channel, and received or sensed over the channel by part B. Communication interactions are just like any other kind in that they establish correlations between the states of the interacting parts.
Part A is built so that it generates message M1 in some states, message M2 in some other states, and so on. Part B is built so that when it receives M1, its state alters in one way, and if it gets M2, its state alters in another way, etc. This mechanistic account allows us to apply this pattern of communicating parts to the simplest of agents, such as components of an electronic circuit, procedures in a computer program, or molecules in a cell.
The system consists of Alice and Bob, who are separated by a partition so that they can hear but not see one another; a pair of lights, visible only to Alice; and a pair of buttons, visible only to Bob. When Alice sees the red light go on, she says “sheep”, and when she sees the green light go on, she says “goat”. (Why? Perhaps she is paid to.) When Bob hears “sheep” he presses the black button, and when he hears “goat” he presses the white button.
Or more pedantically: Alice’s state changes from an idle state to “I’ve seen the red light”, then to “I need to say ‘sheep’”, then to saying sheep, then back to idle. Bob goes through a similar series of state transitions. The states of the lights, of Alice, of Bob, and of the buttons become correlated with one another, e.g. the red light going on correlates with the black button being pressed a little bit later.
The overall effect of the system is that the red light going on leads to the black button being pressed, and the green light going on leads to the white button being pressed. I think most people would say that this effect is achieved in part through communication between Alice and Bob (the shouting/hearing events).
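The whole contrived system can be sketched mechanistically in a few lines (a sketch of my own, not anything canonical): each agent is just a mapping from received message to emitted message, with no shared notion of what anything “means”.

```python
# Each agent is a bare stimulus-response mapping.
alice = {"red": "sheep", "green": "goat"}    # light seen -> word said
bob = {"sheep": "black", "goat": "white"}    # word heard -> button pressed

def system(light):
    """End-to-end effect: which button gets pressed when a light goes on."""
    return bob[alice[light]]

# The correlation the designer cares about, with "sheep"/"goat" merely
# passing through the channel between the two agents.
assert system("red") == "black"
assert system("green") == "white"
```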
Observe: Alice does not need to “know” why she is doing what she does; she doesn’t need to know the “meaning” of the lights (other than that she is supposed to say certain words when they go on); in particular she doesn’t need to know what Bob does on hearing what she says. Similarly, Bob needn’t know why Alice says what she says. If you were to ask Alice what “sheep” means (or implies), she might say it means that the red light has gone on; but if you were to ask Bob, he might say that it means he is supposed to press the black button. To us, as outside observers who know all about what is going on, “sheep” means both of these, but to the agents in the system, the word has different meanings, and that’s OK.
So where is “information transfer” in this account?
In the traditional account of communication, not only are the states of A and B correlated, but B comes to have some property that formerly A had – that property is transferred from A to B. The property is usually described as “knowing P” or “having the information that P” where P is some piece of information (sometimes called a “proposition”), and we say that B has “learned” P from A.
(In the Information Artifact Ontology – ignore this paragraph if you’re not part of this debate – this is described differently, but the effect is exactly the same: the property is that of having a part that concretizes some information content entity P; we start with A possessing this property; the message M somehow comes to concretize the same ICE P; and successful communication consists of B acquiring the property (of having a part that concretizes P).)
Since the message M has a special role in such an interaction, it seems closely connected with P, so we say that P is the “information” carried by M (or that M concretizes P) and that M “encodes” P.
We have seen above that property transfer of this sort is not a necessary element of an account of communication, nor is it necessary for systems that depend on communication-like processes to function correctly. It may be a useful pattern in certain kinds of theories (e.g. MTRANS in Schank’s script theory), and it may accurately describe many kinds of interactions between rational agents, but it is not essential.
The property-transfer idea raises the ontological problem of figuring out what kind of thing or substance “information” is [see "Use and its place in meaning" by Quine]. Because “information” seems non-physical it is mysterious and problematic from an empirical point of view. Skinner calls ideas like this “explanatory fictions”. In designing or checking the functioning of a system, one is faced with the question: How do we tell whether one of our agents has the information it’s supposed to have? If the agent is a person, perhaps they can be quizzed, or be subjected to brain imaging; a computer might have its memory inspected. Such tests are difficult – they break abstraction barriers, i.e. they require intimate knowledge of the internal structure and functioning of the agents. Such knowledge may not be available, e.g. we may have a computer sealed inside a laboratory instrument with no ability to inspect its internals.
And such tests of the “information state” of agents don’t really get to the point, which is whether the system will work, or does work, as expected. A person might respond correctly in a quiz, and a computer might hold data structures that might be described as “having information”, yet in neither case do we have assurance that the agent will act correctly on the information they supposedly have. Ultimately we don’t really care what messages agents use to communicate, or what they “know”. We care whether the system has particular properties, such as the ability to help us do science.
(Alan now wants me to explain a few things like reference (or denotation) and miscommunication, which I think I’ll have to do separately.)
Updated: “linked” changed to “connected”
With a nod to Peter Hurd, and Quine of course
A few years ago I got interested in “knowledge representation” (which might more accurately but less catchily be called “information expression”) and this got me interested in the mechanics of semantics. I thought I knew pretty well how names, expressions, and statements got their meaning in programming languages and mathematical formalisms, but didn’t understand so well how meaning works in open-ended systems such as scientific discourse and the Internet, so did a bit of research. Below are some books that I thought interesting enough to buy. (I looked at lots of articles, too, see my Pinboard page.)
If I were to wait until I had something to say about all these books, that would be forever, so I’m just offering an unannotated list, in the spirit of Phil Agre’s somewhat longer list.
Jon Barwise and Jerry Seligman.
Information Flow: The Logic of Distributed Systems.
Cambridge University Press, 1997.
Philippe Besnard and Anthony Hunter.
Elements of Argumentation.
MIT Press, 2008.
Pragmatism and Reference.
MIT Press, 2009.
Radu J. Bogdan.
Predicative Minds: The Social Ontogeny of Propositional Thinking.
MIT Press, 2009.
MIT Press, 1997.
Martha I. Gibson.
From Naming to Saying: The Unity of the Proposition.
Truth, Meaning, Reality.
Oxford University Press, 2010.
Jeffrey C. King.
The Nature and Structure of Content.
Oxford University Press, 2007.
Wittgenstein on Rules and Private Language.
Harvard University Press, 1982.
Willard V. Quine.
Theories and Things.
Harvard University Press, 1981.
Philosophy of Language.
Princeton University Press, 2010.