Welcome to Digital Dialogues. We have a speaker today who I think has one of the most interesting minds in the field, and it will be a treat to hear what Allan has to say and take it on board. Our associate director Trevor Muñoz will be introducing Allan. What I'd like to do is to have you all introduce yourselves to begin: your name and where you're from. After you do that, I'll ask Trevor to come up. So Stephanie, do you want to start? (audience members introduce themselves) I'm gonna turn it over to Trevor now, who will introduce Allan, and then we'll get on with the show.
For those of you who came in afterwards, my name's Trevor Muñoz. I'm an associate director here. It's my great pleasure to introduce our Digital Dialogue speaker today. Allan Renear is interim dean and professor at the Graduate School of Library and Information Science at the University of Illinois, Urbana-Champaign. Allan also has a long and storied career in the digital humanities. Before he went to GSLIS, he was the director of the Scholarly Technology Group at Brown University, where he did some groundbreaking digital humanities research. He wrote some of the, I would say, seminal papers of digital humanities around text encoding and our ideas about documents. I know he's updated some of the ideas about documents. I think we'll hear a little about this today. After leading a digital humanities group at Brown for many years, he went to the Graduate School of Library and Information Science at Illinois, and while he's been there he's done a long string of interesting work around data curation and foundational concepts in our understanding of digital systems and digital objects. This recent work has taken him from digital humanities into considering objects such as scientific data sets, and the systems we use to manage and curate them.
As Neil mentioned, Allan has one of the most interesting minds in digital humanities, and I think we'll all benefit from his incisive perspective on things that we thought we knew. So at this point I'll turn it over to Allan to talk about an Eliminativist Ontology of the digital world, and what it means for data curation. So, welcome Allan.
Thank you. Thank you for inviting me. Thank you. It's great to be here with my old friends and my new friends. An eliminativist, it's a hard word to pronounce, ontology of the digital world and what it means for data curation. You know, you always give these titles well in advance of the actual talk, and you're sure you're going to accomplish so much by the time the talk rolls around. Never quite do, so I'm not going to [inaudible] an ontology of the digital world, but I will say enough to suggest how a particular kind of ontology might develop. So this is more like towards an eliminativist ontology of the digital world. It will be a kind of unapologetic, reflective, almost philosophical meditation on the conceptual foundations of information science. As Trevor indicated, I was in the workplace in digital humanities for about 20 years. In the last few years I've enjoyed indulging my penchant for the philosophy of the things I've been doing for so long. My work, such as it is now, is so social I cannot figure out what's mine and what's other people's, and I've practically given up. Most of what I'm presenting here has been collaboratively developed by these people, and probably some others, and there are quite a number of papers out there in this vein if you want to read more. But I make the slides, and I also am totally responsible, not only for the mistakes and infelicities, but for anything that seems just a little over the top. That's probably mine. I'm not sure that my colleagues would agree with everything that I say, but that's the problem when you work collaboratively. Deeply collaboratively.
You really sign on for most of what's being asserted, not necessarily for all of it. I also should give credit. A lot of the projects that you'll see here are funded by NSF and also IMLS, located at the Center for Informatics Research in Science and Scholarship at GSLIS, directed by Carole Palmer. Where do I point this? I feel like a geezer. At the screen? There we go. I think it's fair to say I'm going to be doing ontology. I don't mean a lot by the word ontology. I probably could say conceptual modeling, and that would work just as well, so don't read too much into the word ontology. To make sure that you don't read too much into the word ontology, I'm going to talk a little bit about something I'm not going to do, and that is meta-ontology. You may wonder, "why bother?", but you'll see in a minute. Meta-ontology is, as you can probably guess, about ontology: assertions, analysis, arguments, claims, etc. about ontology. A claim in meta-ontology might be, "when it comes to ontology, there's no fact of the matter." There's no theory-independent, society-independent fact of the matter. Ontologies are constructed by people, by theories, by shared interests, and so on. That's a meta-ontological claim. It's about the nature of ontology. It's claiming that it's in some broad sense relative. A relativist claim about ontology. Another common meta-ontological claim, well, actually, every meta-ontological claim has, of course, a companion claim that denies it, so here are two meta-ontological claims. One: there's a sharp distinction between science and ontology. Two: there's no sharp distinction between science and ontology. That's a meta-ontological claim. So Willard Van Orman Quine, probably the leading philosopher of the second half of the 20th century, was a relativist. He did not believe there is any fact of the matter with respect to ontology. He also did not think there was a sharp line between science and ontology.
He was a relativist about everything, so [inaudible] he was a relativist about ontology. These are examples of issues with which I am not going to concern myself. And the reason I'm not going to concern myself with these issues is that they're very distracting. No one ever changes their mind. I no longer think that they're much fun. I also don't think that they are very important. For the most part, no matter what your meta-ontological views, you [inaudible] ontology the same way. Relativists and absolutists do ontology more or less the same way. Those who believe there is a sharp dividing line between ontology and science and those who don't more or less do ontology the same way. The actual practice of ontology, apart from meta-ontology, I find to be engaging and practical, useful, an important thing to do. So how is ontology done, typically, by people with different, or no, meta-ontological views? For the most part we start with our beliefs about the world. The beliefs we actually have. These could be common sense, ordinary beliefs or they could be scientific beliefs. They could be mathematical beliefs. We start with those beliefs and we ask ourselves, "What must there be in the world if these things that I believe are true? What kinds of things must there be in the world if my beliefs are true? What kinds of relationships do they have to one another?" And when you make a list of the things that apparently you think that there are in the world, sometimes that list looks too long. It looks like you have some duplicates. Perhaps you've been misled by language, and you have two different words for the same thing. Perhaps you realized that some kind of thing was composed of other things. Perhaps you also discover you don't have enough things on the list, and maybe you were confused by synonyms.
I did start by saying you think about your beliefs, your concepts, beliefs, and go on from there, but typically it's hard to do that without looking carefully at the sentences that express our beliefs, and that's where the synonyms and ambiguity come in. When we do ontology, most of the time we're thinking about what we believe, but the device that assists us in examining what we believe is the sentences that we use to express our beliefs. So, starting from that point, we go on to try to create a picture of the world that is consistent and simple and accurate and reflects the world that must be out there if the beliefs that we have are in fact true. And from what I can see, it doesn't matter what your meta-ontology is. That's how you do ontology. That's how a lot of ontology is done. I've decided that if I've ever done meta-ontology in the past, I'm not going to be doing it anymore. I'm sticking to ontology. Those were preliminaries, and maybe this one is a little bit as well. The theme of this presentation is Eliminativism, and the basic idea here is that with respect to our beliefs about the world, with respect to our common sense conceptual scheme, some of the things we think exist, don't. Now, you may have encountered this perspective in the past. One place where it's particularly prominent, where it's been called Eliminativism, is in cognitive science, where in the last 20 to 30 years a number of cognitive scientists have argued that our folk psychology of desire, belief, action, is profoundly misleading. That in fact, there really are no beliefs, desires, intentions [inaudible]. Instead, there are other things that are more scientifically respectable, that are more explanatory, that will give a better account of the same phenomena in the world that we've been using belief, desire, intention to describe.
Those cognitive scientists were characterized as eliminating folk beliefs, and I think it was really in cognitive science that the word elimination became particularly prominent. The reason elimination was important as a concept to the cognitive scientists doing this elimination is that it contrasted with what behaviorists were doing when they reduced beliefs to behavior. So instead of reducing beliefs to dispositions to behave, the more advanced cognitive scientists wanted to give an alternative account of folk psychological notions. An alternative account. One that discarded them, in a sense, completely. Unlike behaviorists, who were saying, "I'll tell you what belief really is: it's a disposition to behave," the cognitive scientists I'm alluding to, maybe idealizing a bit or something, were saying, "I'm not going to tell you what belief really is, because there are none." You need to let it go and adopt these other notions, which you will find of much more service. Most of my intellectual life I have detested Eliminativists. I now find myself on the edge of becoming one. In information science, when we develop models that presumably describe precisely some process, for instance, or some domain, and we use a language that is intended to be understood literally, we discover problems that are such that elimination of an entity, of an entity type, becomes a tempting solution. This is, in my experience, particularly the case where our models or ontologies are representing processes that involve change and identity. Eliminativist strategies become very tempting, at least to me. I'm now going to explore some elimination, and it's hard to let these things go. You may not want to. Hence, the courage. As I talk now, please feel free to interject at any point, now that we're sort of getting to the interesting part.
I'm not sure exactly how long this will take, and how much time there'll be for questions so just speak up if you have a question or clarification, or if you wish to contradict me. If I want to put your contradiction off, I'll just do it. (audience chuckles) So change. We are often told, those of us who've been in digital humanities for a long time, and been through the whole hypertext excitement, and all the excitement around things virtual, are told repeatedly that digital objects are fluid, malleable. More generally, that the digital world is a place of constant change. And even if you're not caught up in the breathless hype of hypertext and virtual worlds and such, it does seem that the digital world is a place of constant change. After all, we add records to databases. We edit documents. Our files get larger and smaller. We add things to our digital collections, and we take them away. A lot of stuff seems to be happening. A lot of stuff seems to be changing in the digital world, and these changes are absolutely essential to the practical work that we do. When we add a record to a database. When we remove an item from a collection. When we edit a document. Those are modifications to digital objects, apparently and you might say it's the whole reason for having digital objects so we can do things like that more easily. So the digital world does seem to be a place of constant change. I'm going to argue that you are, we are all, deluded. Digital objects are absolutely immutable. So, questions before us: When a digital object changes, exactly what changes? If digital objects can't change, then what is really going on in the world when we say, speaking loosely, that they change? And, what is a digital object anyway? So here we go. This is the beginning of the argument that digital objects cannot change. That they are immutable and I can give several different versions of this argument. This is in a way the most general. 
It relies upon your ordinary intuitions about sentences. Unlike some arguments to the same conclusion, it's not based on set theory or discrete mathematics nor is it restricted to the digital world. Consider the sentence, "I remember Verona." Imagine that it's the first sentence of the first chapter or draft of a novel. Now, suppose the author edits that sentence to read, "I remember, but dimly, Verona". The first sentence of the draft has been modified. It's been changed. It's now longer. I submit that if you weren't on your guard, none of those sentences would have seemed suspicious. The problem is, exactly what got longer? Something used to be three words and is now five words. Seems like it ought to be a sentence. It consists of words, after all, but what sentences would it be? "I remember Verona"? No. That sentence did not get longer. It's true, that sentence at one time consisted only of three words, but it still consists only of three words. That sentence, "I remember Verona", has not gotten longer. "I remember, but dimly, Verona." Is that the thing that got longer? It's true that it's five words, so it's longer than "I remember Verona", but it's always been longer than "I remember Verona." It has not become longer than "I remember Verona." Did the paragraph get longer, or the chapter, or the entire text? The arguments I just gave here apply equally to those things as well. Just think of them as a longer string of words. I'm pausing for just a moment in case somebody wants to interject something. This is the, you might say, the simple argument for immutability for certain kinds of objects. It's reasonable, but I just have to stop and say, "Wait a second! What do we actually mean by modification or change anyway?" I would submit that we mean by modification or change that something loses or gains a property. 
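[Editorial sketch, not part of the talk.] The Verona argument can be loosely mirrored in code. The snippet below is only an analogy: it assumes a hypothetical `draft` dictionary standing in for the novel, and it leans on the fact that Python strings happen to be immutable, which is a language design choice rather than the philosophical point itself. What it shows is that after the "edit", neither string has a different word count than it always had; only the slot points at a different string.

```python
# Hypothetical model: a named slot in a draft, mapped to whatever string fills it.
original = "I remember Verona"
revised = "I remember, but dimly, Verona"

def word_count(sentence: str) -> int:
    """Count whitespace-separated words in a sentence."""
    return len(sentence.split())

draft = {"first_sentence": original}
draft["first_sentence"] = revised  # the "edit": the slot now holds a different string

# Neither string gained or lost a property; each has the length it always had.
print(word_count(original), word_count(revised))
```

On this picture the question "what got longer?" has no string as its answer: `original` is still three words, and `revised` was five words all along.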
In the case of the Verona sentence, the point of the last slide is to suggest that there's no plausible candidate on the landscape for that, for a thing of that kind.
(audience member) That which the author is trying to project into the mind of the reader. Has that changed? (audience member) Yes. I would posit that as one of the things that's changed. So if you think of writing as a communicative intent of projecting my thoughts into your thoughts, mediated by the paper, word processor or whatever. That's what's changed.
I have two kind of conflicting answers to that, and I have a slide devoted to that particular assertion. I don't actually disagree with you on a deep level. The author is, you might say, to use your phrasing, trying to project something else into the mind of the reader, and so there's a sense in which what the author's trying to project into the mind of the reader has changed. I admit that. There's a sense. But it's not the right sense, because the thing that the author was trying to project at time T1 hasn't changed. It's still "I remember Verona." The thing the author was trying to project at T2 hasn't changed; it's still "I remember, but dimly, Verona." The author's trying to project a new thing into the mind of the reader. I would agree with that. There's a slide I wasn't going to show, but since this came up I'll show it later, that uses a coffee queue to illustrate the same point. We say the first person in line has changed: the first person in line used to be 50 years old, but the first person in line is now 20 years old. We're not claiming that in the interim somebody who was 50 became 20. It's a good, classic, and not very often heard response to this, I think. But in the end, we end up agreeing.
(audience member) Well, I don't know how much you want to derail-- I don't want to derail your discussion, but I might argue that the first sentence was an imperfect projection of that which I'm referring to, and the second was the more accurate, so it's not this thing T1, T2, it's that the first try was a poor one.
So, you may complete this line of reasoning, I think, by saying, "You know, Allan, no one was ever confused about this in the first place. We never really thought that there was a thing that changed," and I'll say, I won't contest that, but I'll be suspicious, and part of my suspicion has to do with UML diagrams that I have seen that imply change where both you and I would say there is none.
(audience member 2) Is there a philosophically rigorous way for identifying things by their structural location, or what we may call some kind of abstract properties that they have by virtue of the space they mark out within a thing? So for example, the classicists have their [inaudible] of stuff, and for them, Plato consists of the following structure, where there may be disagreements about the exact wording of line one, but line one of the such and such work is identifiable as a thing, regardless of what words or specific words or characters we think occupy that space. In the same way, if we marked up the text of Moby Dick, what we are saying of the first paragraph is something that, if this were a digital edition, we could get an ID we could put to it, even if we had disagreements about specific words that are in there. And I feel as though our intuitions about the first sentence of this thing are sort of along those lines. We're not talking about those specific words, we're talking about that structural piece to which we then apply our words, that you're going to destabilize.
But I wonder if this is just kind of an intuitional way, or if there is a philosophically rigorous way of talking about that, and if so, I imagine it wouldn't change what you're saying but it would feel more satisfying if you could speak in those terms.
So actually again I think that perspective that you're taking right now is one that is consistent with where I'm going, but it's actually harder than you think to identify the paragraph apart from its particular contents. That is, to identify it in a way that is consistent with the logic-based modeling languages that we typically use, and I think that will become apparent as we go. So where was I? So, what we're claiming, and this is, especially after these two comments, sort of becoming [inaudible] now, but still, we're claiming that the following [inaudible] is false. There exists something x, and if you're familiar with first-order logic you know there's a drum roll that's needed right here, because this is what indicates our ontological commitments, the fact that we're using an existential quantifier to say in a serious ontological tone of voice, "there is an x such that x at T1 had length three, and x at T2 had length five." So the claim here is that an assertion of this kind is false; there's no such thing. This is a topic that Aristotle actually takes up in the Physics and also I think in the Metaphysics, and here's a quote from the Physics where he's considering a similar problem: "There must be a substrate (ὑποκείμενον) underlying all processes of becoming and changing, but what can it be in the present case?" He's asking about something very similar to what we are discussing. What can it be in the present case? We are totally insane, because, guess what? We agree that "the first sentence was three words and is now five" can express a true proposition, so now we're really taken aback, right? But what we deny is that this is the proposition that it asserts.
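[Editorial note, not part of the talk.] The slide itself is not reproduced in this transcript. One plausible way to write the logical form being denied, assuming a hypothetical two-place function length(x, t) giving the word count of x at time t, is:

```latex
\exists x \, \bigl( \mathrm{length}(x, t_1) = 3 \;\land\; \mathrm{length}(x, t_2) = 5 \bigr)
```

The claim in the talk is that no single value of x makes both conjuncts true, even though the ordinary English sentence can be used to say something true.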
So we know that this sentence can express a true proposition, but we're denying that the proposition that's expressed by this sentence, understood as true, is this proposition. The one that has this logical form. And yes, I'm distinguishing proposition and sentence, [inaudible] We're denying that the sentence is literally true, and in a way the notion of literal truth is [inaudible] throughout. So continuing in the same vein, the claim is that sentences like "Jane lengthened the first sentence of her novel" are idioms, like "the average plumber has 3.2 children." If you were to represent that in logic, if you were doing a logic exercise, you might be tempted, if you were in a hurry and it was like [inaudible], to simply say, "well, there is something that's an average plumber," but of course, you're off on the wrong foot already. That would not be the right way to formalize the proposition expressed by that sentence, despite the fact that the surface syntax of the sentence might suggest that it is. "There's a scarcity of common sense in the room": I'm not saying there's something which is the scarcity of common sense. But even more ordinary sentences, like "Lumbergh revised the TPS memo" (my favorite movie). Sentences like that, yes, they can express true propositions, but the true proposition that they express is not one that looks like "there is an x such that x is the TPS memo, and x was revised by Lumbergh." So it's obvious that the average plumber is a kind of logical fiction, but I don't think it's obvious that the TPS memo is a logical fiction. Our claim is that it is. It is, and that means that if you're going to use a logic-based representation language, like RDF, OWL, Classic, whatever your favorite is, you have a lot of work to do to get from sentences like this into a formalism that you can trust. The great biologist Richard Lewontin made a remark, a reprise of one by Rosenblueth and Wiener: "The price of metaphor is eternal vigilance."
If you want to get from an ordinary sentence like this into a representation in a logic based knowledge representation language, and you want to be able to really trust that representation to never lead you astray in inferencing, it's hard. But if you don't get there, you'll be relying on metaphors and idioms and logical fictions, and the price of metaphor is eternal vigilance against confusing yourself. Drawing UML rectangles for things that don't exist. So I'm going to move quickly through some of these slides. Sort of taking the temperature of my audience, I think we've assimilated this basic argument. I'm capable of belaboring things at great length, so I think I'll not. I do want to suggest that if you still find the argument irritating, and are sure there must be some way out, you might see your problem as trying to decide which one of these three things to reject: documents are strings, strings cannot be modified, documents can be modified. You can reject more than one, but why? If you can justify rejecting one, you've gotten around the puzzle that I presented you. But, for each one that you reject, you have an obligation. If you reject the first, you need to offer an alternative definition of document. One that supports modification. If you reject the second, you need to reconcile modification with the extensionality, with the apparent immutability of strings and if you reject the third, then you have to give some account of what's really going on in cases of modification, such as editing. If editing is not the modification of the document, strictly speaking, then what is it? So whichever one you reject, you've got a kind of an obligation in order to make your rejection credible, plausible. Just for fun, I'm going to call this the MITH feud, 2013 and going to ask you, those of you who think, I'm going to ask you which of the assertions in the inconsistent triad you would reject. 
I'm slowing down just to give you a chance to form your-- They can't all be true, right? To form your opinion. Alright. Who wants to reject one?
(man in audience) Documents are not strings.
Documents are not strings. The party of documents are not strings? Okay. Who wants to reject "strings cannot be modified"? Wow! Three. Okay. Who wants to reject "documents can be modified"? Interesting. I've never had such an even distribution. With respect to the first assertion, "documents are strings", I have to confess that it was a convenience to some extent to assert that documents are strings. Karen Wickett and I first presented this at Extreme Markup, now called Balisage. Those are XML zealots, and so we used the XML definition from the XML standard: "A textual object is a well-formed XML document if: taken as a whole, it matches the production labeled document..." The only kind of thing that can match the production is a string. It's harder than you might think. I shouldn't say that you might think, but-- it's harder than sometimes, I think, to get out of this simply by denying that documents are strings, because most of the definitions of document or text, even when they're not definitions in terms of strings, are nevertheless similar enough in the right respects that they're also unmodifiable. At this conference, for instance, it's very common to say that a document is a graph, meaning this kind of graph, you know? And they mean that in the mathematical sense. But a graph is a set of tuples, and sets of tuples can't change because sets can't lose their members. So graphs don't work. If you look closely at FRBR's notion of the expression as symbolic notation, it's pretty much string-like, even if it's not a string. A string in the mathematical sense is a function from integers into some domain of elements.
The notion of expression is not exactly mathematical, but it's clearly a sequence of elements, and our intuitions about the Verona sentence, I think, count against FRBR's notion of expression. Similarly, in textual criticism, Tanselle's notion of a text, I also think, is not the kind of thing that can be changed. I'm going to come back to that in a minute, so this is not the end of [inaudible]. Strings cannot be modified. Some of you rejected that, and said that strings can be modified. So modification on my account is a losing or a gaining of a property. I would claim that a string like "13571" has properties, but it has no properties that it can lose. It has the property of having [inaudible] five tokens, of having one token [inaudible] twice, [inaudible] 35, and so on, but I would say that that string, that string, can't lose these properties. That string cannot lose those properties. We cannot identify a thing that once had one of those properties and later did not. Now, I realize there are some properties a string can have, and lose. For instance, "13571" has the property of being talked about in college [inaudible], and it will lose that property. But that's a pretty thin change, right? That's not a change to the string. That's a change in the relationship between the string and some other thing. It's like you might not be the tallest person in the room, but when the tallest person in the room leaves, you might become the tallest person in the room. Have you changed? I would say no. So the thing about strings is that although they have some properties, all of their inherent properties they have essentially, so they can't lose them. They only have their relational properties contingently. So that is sort of the interesting thing about things like strings. They have some contingent properties, but all of their contingent properties are relational.
They have some inherent properties which could count as properties generating modification if you could lose them, but all of their inherent properties are essential, so they can't lose them, so they don't change. In favor of "documents can be modified": we all believe it. It's part of what we say and do. So these last three slides are supposed to suggest that it's not easy to get out of this problem. There are, in my mind, four relatively significant responses. One is to deny that documents are anything of the kind I've been saying they are. That they're material objects in the world, and material objects in the world can change. Another is to say that documents are social objects, and social objects can change. Another is to say that every time we edit a document there's actually a new document being created, so documents aren't really changing, but what's happening is that new documents are being created, so this does deny that documents can be modified. The last one, which I gave an asterisk to because I think it's the one-- if I'm not an eliminativist, it's the one I'm going for, is the string-in-a-role strategy, which argues that documents are things like strings, but they are not just strings, they are strings in a particular communicative role.
(man in audience) Can I try another way out?
Yeah.
(man in audience) So, what if I take the argument that the three assertions are not contradictory? So take a look at the second one. If we think of a string as an element drawn from the set of all possible combinations of characters, then you're simply drawing a new element from that set, so if you look at it from that perspective the three are not contradictory. I guess this is closest to the new document theory, in that when you modify a document, you're just simply drawing another string from the set; you're not modifying the string.
I would say that is the new document theory. Which is I think the most popular response, particularly from [inaudible] computer scientists.
It does deny that documents can be modified, that is, I think, that strictly speaking, literally speaking, documents can be modified. [inaudible] So the string-in-a-role strategy. String-in-a-role is somewhat harsh in that it does deny our common sense belief that documents can be modified. It also doesn't just do that, by the way. It also finesses the definition of document in a very subtle and important way. This response claims that a document is a string in a particular role. That in fact, being a document is a property that strings come to have in particular contingent social situations. And here's the finessing, and it's an ontological maneuver, you might say. On this account, document is not a type of entity. Being a document is a role that some entities come to have in particular circumstances. So document is a kind of nominalization of a relationship, the kind of thing you would not express as-- at least it's plausible that in your UML diagram it would be inappropriate to have a rectangle for documents. Instead you would have a rectangle for strings, and an arc for being in a documentary role, or something like that. So compare this, and I get the example from Guarino and Welty, this is very well known, with the concepts of person and student. A student is a person in a particular role. A person who has enrolled, let's say. But a person is not a role that something else takes on. That's the intuition here. A person can become a student, and later cease to be a student. We'll see this example again in a bit. So just summarizing, document is a role. This is consistent with not just Guarino and Welty but also John Searle, if you're familiar with his writing about the ontology of the social world. Documents are strings, but strings are only documents while they are in a communicative role. Because documents are strings, they're going to be immutable. The thing that is a document can't change.
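[Editorial sketch, not part of the talk.] The string-in-a-role idea can be illustrated with a small model in which there is no `Document` class at all. Everything here is an assumption made for illustration: the `CommunicativeRoles` class, its method names, and the role label are hypothetical, and this is not offered as the speaker's own formalization. Being a document is modeled as a relation a string stands in for a while, much as a person is a student only while enrolled.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class CommunicativeRoles:
    """Hypothetical registry: which string currently fills each communicative role."""
    filled_by: Dict[str, str] = field(default_factory=dict)

    def assign(self, role: str, text: str) -> None:
        # Put a string into a role; any string previously in that role is displaced.
        self.filled_by[role] = text

    def is_document(self, text: str) -> bool:
        # A string counts as a document only while it fills some role.
        return text in self.filled_by.values()

roles = CommunicativeRoles()
s1 = "I remember Verona"
s2 = "I remember, but dimly, Verona"

roles.assign("first sentence of the draft", s1)
was_document = roles.is_document(s1)   # s1 is in a role at this point

roles.assign("first sentence of the draft", s2)
still_document = roles.is_document(s1)  # s1 has left the role, but is itself unchanged
```

In UML terms this corresponds to a rectangle for strings and an arc for the role, with no rectangle for documents.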
I mentioned the burden that one has to bear if one denies modification. How do we give an account of what apparent modification must be? And I know I'm waving my hands a bit at this point, but roughly, when we say that a document is being modified what's going on is that a person or persons come to prefer a different string for a particular communicative role than the string previously preferred for that role. I think I may have heard that even from a couple of you already. Apparent changes in digital documents, and you can generalize this account to all digital objects, apparent changes in digital objects. Remember the constantly changing digital world I referred to at the beginning. Apparent changes in digital objects are actually changes in us, in the person or persons interacting with those objects. They're not changes in the documents themselves. So what changes when a digital object changes? To answer the question posed earlier, you do. I promised you some Eliminativism. If you find it hard to accept that documents cannot change, and you should find it hard to accept, because it is part of our conceptual scheme, I think, there is another way out. (voice in audience) This is such a relief to me, Allan. (laughter) There's another way out, trust me! You're not going to be happy without it. To rehearse where we are, it is commonly believed that documents can be revised, edited, shortened, lengthened, and modified in various ways. That belief is widespread and deeply rooted. I characterized it as part of our conceptual scheme. Perhaps it is so deeply rooted that it's actually integral to our concept of a document. If that's the case, then we can express this relationship this way. If there are documents, then there are modifiable documents. It may be more natural to say, if there are documents, then they are modifiable. But we've shown that there are no modifiable documents. 
From the claim that if there are documents there are modifiable documents, and the assertion "there are no modifiable documents", the conclusion is only that there are no documents, and that's elimination. Let me just briefly say that there is another line of reasoning to the same conclusion, that looks at the constructs in discrete mathematics that are typically used to define digital documents. All of those concepts, whether they be strings or graphs or relations, are eventually defined in terms of sets, and our standard set theory holds that membership in a set is essential to the identity of the set. Sets cannot lose or gain members. Sometimes mathematicians speak loosely, but when they're not speaking loosely, they do recognize that a set S and a set T are identical if, and only if, they have exactly the same members and that's a forward and back, that's not just at a time. Sets are used to define strings. They're used to define the relations in a-- actually, let me expand on that a bit. In a relational database model we see information as organized in a table. And our textbooks tell us that table is understood as a mathematical relation, which is a set of n-sized tuples. We speak of adding or deleting records from tables. That corresponds to adding or deleting tuples from a set, and having the set survive the change. Sets cannot lose or gain elements, whatever they are. The conclusion is documents can never change. You can't add a record to a database. You can't delete a record from a database. Database switched to table here, but it doesn't make any difference. A database table is a relation. A relation is a set. Sets have their members essentially. They can't lose their [inaudible]. Same goes for collections. Collections are often defined as sets. I think I've got them coming up here. Yeah, there we go! Good old [Ed Fox]'s students gave this account of the digital library. Collection is a set. There, they say it. They even use curly braces. 
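To make the set-theoretic point concrete, here is a small Python sketch that is not from the talk. It assumes a relation is modeled as a `frozenset` of tuples, and the employee rows are invented for illustration. Python's `frozenset` behaves like the mathematician's set when speaking strictly: it has no `add` method, two of them are equal exactly when they have the same members, and "adding a record" produces a different set rather than changing the old one.

```python
# A relation modeled as an immutable set of tuples (the rows are hypothetical).
employees = frozenset({("ada", "engineering"), ("grace", "research")})

# "Adding a record" does not alter `employees`; it builds a different set.
with_new_row = employees | {("alan", "research")}

# Identity is by membership: the same members give the same set, in any order.
same_members = frozenset({("grace", "research"), ("ada", "engineering")})

unchanged = len(employees) == 2            # the original set still has two tuples
distinct = with_new_row != employees       # the "updated table" is another set
extensional = same_members == employees    # same members, therefore equal
has_add = hasattr(employees, "add")        # frozenset offers no way to gain members
```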
If a collection is a set, you can't add anything to it. Nor can you remove anything from it. [inaudible] Suddenly all these things that we had in our digital world that we're very familiar with, very familiar, talked about all the time, seem to incorporate logical inconsistency in their very nature. One response is to say, no, it's not inconsistent, it's just that our notion of those things was inadequate and we have to face the fact that you can't add something to a collection. You can't subtract a record from a database. You can't edit a document. That's one response. The eliminativist says, you know what? If you're going to go that far, you give up-- rather than adopt a position that is that repugnant to my conceptual scheme, my notion of a document, a collection, a database, I'd rather just say there aren't any, because the idea of a database table that you can't add a record to is just not consistent with my notion of database table. (man in audience) As a computer scientist, can I offer what I seem to think is an easier way out? When you're modifying a table, this is actually going back to my attempt at going down the path of a new document. What you're doing is you're actually choosing a new relationship whose new properties reflect the differences that adding [inaudible] a table. So when we say we're adding a row to a table, that's a shorthand for saying we're manifesting a new relationship in which the only difference between this relationship and the previous relationship is the row that I modified. And again, I actually am not going to contest that view because the point I want to make is that literally speaking, the relation is not modified. (man in audience) Yeah, you're choosing a new relationship from the universe of all possible relationships and when you're saying we modified the table, that's just a shorthand for doing that and I think that doesn't deny the existence of documents or tables or anything else, but gets us out of this jam. 
So I would say it doesn't get us out of the jam, because what we're agreeing on is what's really going on. But, I maintain that a relation cannot-- that you cannot add a record to a relation. (man in audience) That's right, it's a new relation. Well, it's a different relation in a way. So we actually agree, I think. (man in audience) But you're denying the existence of the document. What I'm saying is that if the immutability of relations is repugnant to your concept of a relation, then there is another approach, and that is to deny that there are, I have to say tables in this case. There are database tables. So a database table is a modifiable relation, but there are no modifiable relations. Therefore, there are no database tables. That's how the argument goes. In Codd's original paper on the relational model he sets up this near convergence that we have here. He says something like, I can't remember exactly, but he talks about how an actual database over time is really a function from times to sets of tuples. And you could say to me, "Allan, you're completely confused here. A database is a function as Codd says. A database is a function from times to sets of tuples." I'd say, yes, that may be true, but there's still nothing in the landscape that's mutable. So when you start writing assertions in a modeling framework, UML, RDF, whatever, you had better not have variables ranging over tables that are modifiable because that would be a literal interpretation of the sentence you and I have agreed on interpreting with a paraphrase. (man in audience) Yeah, I guess as a computer scientist, if you are working in the domain of functional programming I don't think any of this would seem as a shock. I guess I don't see the cognitive dissonance that should spring in my head, that you're saying should spring in my head-- So I think maybe you're right. That at this point, having talked about the paraphrases [inaudible], the kind of dissonance started dissipating. 
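The "function from times to sets of tuples" reading can be sketched in Python in a functional style. This is offered as an illustration rather than as anything shown in the talk, and the names `History` and `relation_at`, the integer timestamps, and the rows are all assumptions. The point of the sketch is that nothing in it is mutable: the history is a fixed tuple of (time, relation) pairs, each relation is a `frozenset`, and asking what the table looks like at a given time is an ordinary function application.

```python
from typing import FrozenSet, Tuple

Row = Tuple[str, str]
Relation = FrozenSet[Row]
# A fixed sequence of (time, relation) pairs, assumed sorted by time.
History = Tuple[Tuple[int, Relation], ...]


def relation_at(history: History, time: int) -> Relation:
    """Return the relation in force at `time` (empty before the first entry)."""
    current: Relation = frozenset()
    for t, relation in history:
        if t > time:
            break
        current = relation
    return current


r1: Relation = frozenset({("ada", "engineering")})
r2: Relation = r1 | {("grace", "research")}   # a different relation, not r1 altered

history: History = ((1, r1), (5, r2))

before = relation_at(history, 3)   # the relation in force at time 3
after = relation_at(history, 7)    # the relation in force at time 7
```

Saying "a row was added at time 5" is then a paraphrase of the fact that the function returns `r1` for times before 5 and `r2` from 5 onward. Neither relation, nor the function, has changed.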
The problem is most acute when we're trying to actually develop a conceptual model for a repository or a preservation system or a document management system and we're drawing boxes and arrows and have an interpretation [inaudible] watching. The decisions that we have to make are actually hard. Let me take a specific example. So, [Planets?], which is based on PREMIS, has a nice UML diagram of its preservation model. And they classify documents as bitstreams, and they also attach a modification date to the document class, but if a document is actually a particular bitstream, then it is not going to be modifiable. You think of the class of bitstreams as the class of every combinatorially possible bitstream. A document is one of them. That document cannot become some other bitstream. To me, that just says well, there's interesting work to do here, if we're going to have a UML diagram that matches our intuitions a little more closely or that lets us work with these a little better, but my general point is, if you take the sentences we are likely to articulate, and try to represent them in a logic-based conceptual modeling language, even if you're pretty good at it, even if you try hard, you will end up just like the PREMIS Planets people did, not creating a system like you just described in your paraphrases, but creating one that actually has contradictions in it. Most of the time it doesn't matter because there's so much English involved, there's so much human intervention involved, we're able to navigate these problems, but the more we move towards automatic inferencing over our ontologies and over our assertions, the more likely it is that we start to replicate every paradox of the last 2,000 years in these lights-out automated inferencing systems that are just completely unforgiving, that don't understand what we really mean. 
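One possible way to draw the boxes so that no modification date lands on an immutable bitstream is sketched below in Python. This is a hypothetical model for illustration only: the classes `Bitstream` and `RoleAssignment`, the role name, and the dates are invented here and are not taken from PREMIS or Planets. The assumption is simply that the date belongs to the role assignment (when a community came to prefer a bitstream for a role) rather than to the bitstream.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import date


@dataclass(frozen=True)
class Bitstream:
    """One particular sequence of bytes; it cannot become another."""
    data: bytes


@dataclass(frozen=True)
class RoleAssignment:
    """Hypothetical record: which bitstream is preferred for a role, and since when."""
    role: str
    bitstream: Bitstream
    preferred_since: date


draft = Bitstream(b"first draft")
revised = Bitstream(b"second draft")

# Trying to "modify" the bitstream is rejected by the frozen dataclass.
try:
    draft.data = b"second draft"
    mutation_rejected = False
except FrozenInstanceError:
    mutation_rejected = True

# What looks like an edit is a later assignment of a different bitstream to the same role.
v1 = RoleAssignment("annual-report", draft, date(2012, 1, 10))
v2 = RoleAssignment("annual-report", revised, date(2012, 2, 3))
```

Under this assumption an inference engine never meets a bitstream with a "date modified", only a role whose preferred bitstream differs at different dates.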
(man 3 in audience) Can I add an element of time management here-- (man in audience) Feel free to shut me up if you-- (man 3 in audience) No, it's just that we are running out of time, so, Allan, could we shift over to a couple of questions before we stop? (man 4 in audience) I think I need a bit of clarification on what is meant by document because you talked about documents as a sentence, even a database. It seems that it could apply to any sort of digital object. Is that what you mean, and if that is the case, then although I could agree that a digital object or document is definitely a set or a bitstream, I keep on disagreeing that it's a string, because although a string is a set, there are other properties and constraints that are associated with the fact that it's a string. For example, it has a certain order, a sequentiality. And that doesn't exist in every document. First of all, [inaudible] so I think I need a bit more clarification on what you mean by document and [inaudible]. So, I don't have a-- you may have noticed in the beginning. I don't necessarily want to tie myself to any particular account of document, either specific definition or colloquial notion, so I would say, I'll take candidates for what a document is. Presumably, in ordinary circumstances, something like the TPS memo. Something that can be revised, something that can be authored, something that communicates, and when we look at definitions, whether it's in FRBR, or a [inaudible] or the XML standard, we often see accounts of a document that do make it look like a structure of some kind, often a string of symbols. But maybe one clarification, though, is that clearly, I'm focusing attention on the repeatable abstraction, not on the material object that embodies the abstraction. (man 4 in audience) Yeah, absolutely. That's why I think [inaudible] really works. I'm not trying to contradict the conclusions that you attempt to draw. 
I think that you managed to convince me that a document is an immutable object. I just don't think it's a good idea to call it a string, because a lot of documents will not be strings. So I'm going to take an example from what I know better than [inaudible]. I consider them documents in that they are revised, they're edited, they communicate a set of instructions and a lot more, if you will. Although there is definitely some sense of sequence, they do not operate as a sequence only because even though they communicate something that is supposed to happen in time, so there's one event after the other, they also represent [inaudible] events that happen at the same time so you have at least two sequences that are concurrent. And this is not just thinking in terms of a graph. It's not just an overlapping of qualities, it's an ontological problem and you cannot just model it as something that happens in a sequence, in a line. It's not a string. So let me try this as a response. In the end, despite all this talk about strings and such, it's the fact that abstract objects have no contingent inherent properties that drives the argument forward. I refer to specific constructs from discrete mathematics like strings and so on, because they're so common in the books that we read about digital objects. But, however you conceptualize your score, if it's a repeatable abstraction, it's going to be implausible that it's mutable. It is plausible because we talked about [inaudible] score, but for the same reasons given here, after reflection, it becomes implausible that it's mutable. And so, the kinds of paraphrases that we use for strings will also involve social, community intention. It looks like shifting change to communities is what we're doing. Convention, intention, all that stuff is going to have to happen for scores, of course, as well. 
Even though I don't have a snappy answer for what a score is, I'm still fairly confident that whatever it is, if it's a repeatable abstraction, it will not have any inherent properties that are contingent and therefore will not be modifiable, and therefore its apparent modification will be a social construction. A genuine social construction that's dependent upon our intentional effort as a community. We have to stop there, but if your brain is like mine right now it's racing in a lot of different directions. Our mind. I'm sure that Allan will be at the front of the room to talk to you if you would like to talk more. Let's thank him for a great digital dialogue. (applause)