Werbos:

Werbos: OK. Now let’s see if I can make this coherent. These are very big subjects. I’m going to feel halfway dishonest no matter what I say because I’m going to leave out something critical and it’s kind of frustrating but… but, I guess we know that. I’d also like to emphasize that this is going to be an informal presentation. It doesn’t represent my employer in any way – I’m taking vacation time for this – and it doesn’t even represent the kind of views I fall back on when I do my normal job. Some of this will be guessing, not science.

Velamoor: Again, the rules we are following allow one hour per person – half an hour for the presentation and half an hour for discussion subsequent to the presentation, and then on to the next person. If you’d like to stay within that…

Citron: Tomorrow, we’ll have the time for continued discussion on these presentations.

Werbos: First, some general comments, then. I certainly agree with you that there is a biological basis for intelligence. It seems bizarre that we live in a society where there are taboos that tell you that you can’t study the impact of biology on intelligence because it doesn’t exist. It’s really hard to study something when they want you to pretend that it doesn’t exist. On the other hand, I think it is an extremely complex phenomenon. Fooling around with genes is tricky. It reminds me of the people who ask us “What do you think about drugs applied to the brain?” -- the people who want to take LSD because they think it will make them smarter. I claim that those ideas are just like the theory that you can make a television work better by hitting it with a sledgehammer.

Velamoor: That’s amazing because that’s precisely the example Walter gave me yesterday about genetic engineering.

Werbos: OK.

Calvin: It’s true for medications too.

Werbos: Exactly. That’s a really serious problem. We have a medical culture today where, let’s face it, people thought frontal lobotomies would do no harm to the brain. Can you remember? This was dogma. It was established belief that you could wipe out the places where advanced planning occurs in the brain and people wouldn’t suffer for it. Now -- given the fact that we have a culture which has a proven track record for doing stuff like that, if we start messing around with the genes, not knowing what we’re doing, we’re in trouble.

The problem here is that the mechanisms that we have today in the medical and political establishment to guarantee that we don’t make a mess have a proven track record of not working. We tend to be excessively interventionist. In the old days, there was a tradition of doctors who said that you should always work with nature. They were very, very careful about treatments that would try to disrupt nature. But a lot of those old traditions have broken down, maybe because of all the big money now flowing into medical systems. So now we have the problem that we try to intervene too much.

On the other hand, there are other interventions from the political side, as Dr. Levin mentioned, like the welfare system. I think that the worst problems in that system have been taken care of, but certainly there were problems five years ago in the welfare system which were very, very scary. I mean, there is just no question. That’s another kind of intervention that was causing a problem.

Before we start trying to intervene, one way or another, we really need to work harder to understand this phenomenon of intelligence. Before we start trying to manipulate it we should at least try to get some idea of what it is. So now I want to launch into my own, very narrow view. I will hide in my own professional shell here. I promise you I don’t entirely live within the professional shell, but just to get things started I will talk about the main things I do through NSF, which is try to understand a little bit better what intelligence is. The idea is to understand it first before we start trying to manipulate it. And it may be that it will be 20 years before we begin to understand it enough to be able to begin to ask some of these questions in a really scientific sort of way. This doesn’t mean people shouldn’t try to address some of these questions even before we have the prerequisites, but we have to be very careful. In the meantime, some of us would like to actually understand this thing scientifically even if it takes the rest of our lives to do so.

I agree with a lot of the things that Bill Calvin was saying in his paper and I hope some of you guys have a chance to look at it or read it elsewhere. But I think it’s possible to put some mathematical bones under Bill’s flesh, if I can put it that way. The idea is that the qualitative phenomena he describes... I think there are some mathematical scientific foundations underlying them. And I at least have some ideas on that. But let me start with the general stuff here.

(FIRST SLIDE: WHAT IS NEUROENGINEERING?)

NSF hired me about 10 years ago to run what was a new program in neuro-engineering. The scope has expanded in the last couple of years but I’m still funding this area, doing work in this area among others. It’s in the Engineering Directorate. The goal was to build artificial neural nets to imitate what the brain does, for use in engineering. The folks who hired me said “We want engineering applications to come out of this,” and I said “That’s fair enough, but before I take the job let me tell you I really care more about understanding the mind than I do about engineering applications.”

Since then I’ve changed a little. Some of these engineering applications are really phenomenal. When you folks at the Foundation for the Future start talking about technologies that could change the future… there are a lot of them that go through the engineering directorate at NSF, and a lot of them require these new types of learning systems. But still, even now, my main goal at NSF is to use engineering applications as a testbed for developing the mathematics we will need in order to really understand intelligence in the brain and mind. So even though I’m using engineering applications as a testbed, that’s our reality test. That’s what I use to evaluate progress and set priorities.

I always ask: “Are there engineering applications? Does it work?” I am trying to develop real intelligent systems like the brain.

If you look at the slide, you’ll notice that there is also a feedback from neuroengineering to neuroscience. There are lots of mathematical models that people develop in fields like computational neuroscience or connectionist cognitive science. I have seen cases where some groups are mystified by their very complex 20-equation differential equation model of learning, but an engineer can look at that and say “Hey, wait a minute. I recognize these equations. That one is just a correlation coefficient. That won’t work. I could give you an example in two minutes of how this beautifully sophisticated-looking system couldn’t process what a fly processes – let alone a human.” So there are a lot of mathematical models out there that are just plain silly, and yet people believe in them and invest a lot of effort. The underlying problem here is that you can’t get a good mathematical model of a system unless you have some idea of how to build systems that perform the same basic function. If you want to reverse engineer a radio, you’ve got to understand the fundamental principles of what it’s trying to do.

Let me try to explain this more by analogy. Years ago, I studied under a guy named Charlie Gross who was a student of Karl Pribram in neuroscience. Charlie gave an analogy at the beginning of his class. He said that the way we do neuroscience today is like we’re studying a radio. How do we try to understand how the radio works? First we pull out a resistor, the radio whines, and we call this the whine center. And after we’ve used up a thousand radios, we have this map of the whine center, the squeak center, the spark center. We’re not learning how the radio works when we go at it that way. You’ve got to go to your mathematical principles and apply them if you ever want to understand what’s going on.

(NEXT SLIDE: NEURAL NETWORKS IN... 3 DISCIPLINES)

The next slide shows another way of describing this problem – the reason why we need an engineering input before we can really hope to understand intelligence in the brain.

Years ago, I found out that we have a communication problem even within the neural network field. We have a guy named Grossberg you may know and a guy named Widrow you may not know who kind of invented modern signal processing. Each of these folks is a really innovative pioneer, committed to the neural network enterprise – but somehow, they have real problems sometimes in understanding and appreciating each other’s work.

For my own personal reasons, I have put real energy into trying to understand why such critical and constructive people have had difficulties in being constructive towards each other at times. In my opinion, the underlying problem is that they look at each other’s papers and they think they’re all bullshit. They do not really empathize with the other guy’s standards of validation. So Grossberg looks at Widrow and asks “Where is your biological basis? Isn’t it all just theory?” Widrow may reply, “Well, I’m just trying to make things work. It’s not theory – it works on real systems in the real world.” Widrow may look at Grossberg and ask, “All these equations – why do you think any of them work? It’s just more and more equations that don’t add up to anything.” The problem is that we’re all using neural network models but we have different standards of validation. One group asks if it will work. Is it functional? One guy asks if it fits behavioral data. Another guy asks if it fits the circuitry of the brain.

My argument is that we can never begin to really understand intelligence as it lives in the brain unless we build models that meet all three of the standards simultaneously. If it can’t generate intelligent behavior, that won’t work – it’s not a brain, no matter how fancy the differential equations are. If they can’t produce intelligent behavior, it’s a stupid model. Likewise, if you can’t generate the observed empirical behavior of animals in animal experiments, it’s not real. And, of course, you do have to address the real circuitry observed in the brain.

Some folks at NSF actually listened to these kinds of ideas a few years back. So they set up a big initiative – Learning and Intelligent Systems – which was supposed to support cross-disciplinary research that would allow people to cross-hatch these validation criteria (and support some other kinds of cross-disciplinary research). Unfortunately, this is the last year of that activity, so far as I know. It is being supplanted by newer and larger initiatives in areas like web-based computing, which are more reflective of the current hot topics in venture capital investments.

Unfortunately, there was a gap in recording here, precisely when I discussed my views of the nature of intelligence in more detail.

For example –

There is a slide where I listed various aspects of “consciousness” in the human mind.

This subject has been very hotly debated. Many of these debates make lots of claims about what we think in the neural network field. Yet I am not aware of any book on this subject written by one of the real pioneers in neural network research. For myself, I believe at one and the same time that: (1) consciousness and intelligence can both be fully understood in mathematical terms, in principle, and we have made great progress in developing that understanding in recent years; (2) nevertheless, there really is such a thing as a human soul, and an understanding of the brain as such – even in a very subtle way – would not be enough to explain the full depths and capabilities of the human mind. I certainly do not expect other rational people to agree with me on this, and I won’t say a lot more about that subject today. Today I will focus on the more narrow subject of intelligence as such in the human brain (which is large enough as it is!). If you are interested in my thoughts about those larger subjects, look at my chapter in Levine and Elsberry (see slide) and some of the other work which it cites in turn.

Next there was a slide showing three views about “intelligence.”

In one view, “intelligence” is a kind of binary variable. Either you’ve got it, or you don’t. I have a colleague who looks at designs for “intelligent systems,” and always asks:

“But is it conscious or isn’t it? Is it intelligent or isn’t it?”

In another view, “intelligence” is a continuous variable, like IQ. Almost none of the people in my field think that way, but it’s an important part of how some psychologists think. Years ago, there were dogmas about psychology which insisted that humans and snails have exactly the same kind of intelligence, and follow the same path of learning – except that snails are just a tad slower. (I suspect this dogma was motivated in part by a desire to get money for studying snails, and claiming it would be directly useful to understanding humans.) With time, a number of pioneering psychologists like Bitterman have overthrown that dogma, and shown how different types of animal really do have very different kinds of capabilities – but it took a long time for their work to be appreciated by the mainstream.

In my view, both of these views of intelligence are fundamentally misleading – at least when we ask the fundamental question of what intelligence really is, in a larger perspective. (But this is not a comment on the use of IQ to understand differences within the human population, which is another issue.) I would view “intelligence” as something like a staircase or ladder of qualitatively different levels. Maybe it’s a tilty staircase, because there are some significant quantitative differences within each level, but the really important differences are the differences between different types of intelligence.

Now: how can I argue that my view of intelligence is right, and the others are wrong? Philosophers sometimes ask me: “How do you know your definition of the word ‘intelligence’ is right, and the others are wrong?” Well, in a sense, you can’t really say one definition is right and the others are wrong. Definitions of words are really just arbitrary conventions. Nevertheless, when you choose a definition, you can ask yourself which definition is more useful in a practical sense. If you define “intelligence” as “four-toed purple horny frogs,” the definition won’t be very useful, because it doesn’t correspond to anything that actually exists out there in the real world.

What exists out there in the real world that we really call “intelligent?”

See the next slide.

(NEXT SLIDE: LEVELS OF INTELLIGENCE... REPTILE, MAMMAL, ETC.)

In the real world, what we see are different classes of vertebrate, like reptiles, birds, mammals, and so on. There is variation within each class, but Bitterman has shown how there are vast qualitative differences in behavior between these different classes of animal.

Furthermore, there are vast differences in brain design between the different classes of animal.

For example, consider the mammals. When you read some books about the brain, it sounds as if different kinds of mammal really have very different kinds of brain. Maybe a rat would have 7 areas of the brain to process visual images, while maybe the monkey would have 20 or so (I forget the number); it sounds very different. But all of these visual areas they talk about are parts of a common structure called the neocortex. All mammals have this basic organ, the neocortex, and nothing else does. All of this complex image processing is actually based on a universal kind of learning principle, which is present all across the neocortex. When one of these visual areas is damaged, other parts of the neocortex can take over from it. This is called the principle of “mass action,” and it’s been very well understood for decades – in part because of the pioneering work of great neuroscientists like Lashley, Freeman and Pribram. The underlying principles are all the same – even though some mammals do have bigger neocortices and better eyes than others do.

On this slide, you will see that there is a box called “symbolic/semiotic” which is one level beyond the box called “mammal.” And you’ll see that there is an arrow reaching from the word “human” to a point halfway between those two boxes. What am I trying to say here?

On the one hand, I am saying that humans really are more intelligent than other mammals, in a fundamental qualitative way. Maybe that isn’t politically correct, but it’s really pretty obvious from a practical, commonsense point of view. Look at what humans have accomplished on this planet, compared with chimpanzees and all the rest. Humans are a classic example of what the great biologist George Gaylord Simpson called a “quantum breakthrough” in evolution. (I am amused sometimes to see how some of Simpson’s well-known classic ideas are now often attributed to the more recent work of Gould. Gould has built on Simpson, but it did not come out of a vacuum.) Humans are an example of a single species which has spread all over the planet, based on a special advantage. That advantage is a matter of superior intelligence and technology.

However, humans are certainly not the first example of a quantum breakthrough. It happened before with the first amphibians and reptiles and so on. Maybe it even happened with the first warm-blooded animal to possess a neocortex (though the evidence has largely been lost with time). The process of quantum breakthrough does not usually end with that single species straddling the world. In fact, the human species shows all the symptoms of the early stages of a quantum breakthrough. We aren’t the final stage; we have met the missing link, and he is us. We are the transition stage between the mammal brain and the sapient or semiotic level of intelligence.

What is this transition like? How does the base mammal brain work... and how is the sapient brain different? What is the nature of this qualitative evolutionary change that we are now in the middle of?

-- The next slides describe my answers to those questions in some detail.

Unfortunately, it would be hard for me to reconstruct what I said at this time. However, the details are in my four papers in the three recent books (1994, 1996 and 1998) edited by Karl Pribram for Erlbaum. At least a couple of those papers were transcribed from talks aimed at a fairly general audience.

Those papers mainly address the basic mammal level of intelligence.

Fortunately, the transcript picks up from the next part of my talk, where I begin to talk about a new issue – my thoughts about the transition from the base “mammal” to “semiotic” level of intelligence. (By the way, I do believe that there are higher levels, like a quantum level and a “multimodular” level – but that the human brain as such does not incorporate those higher levels.) We do not know if we are actually halfway between the base mammal level and the sapient level, or one-third of the way, and so on; we don’t know exactly what the next evolutionary equilibrium would be like (even assuming that evolution continues to work), because we aren’t there yet and we haven’t worked out that level of the mathematics either. We do know that we aren’t at that equilibrium yet.

TAPE 2

This tape begins with my slide referring to the work by James Anderson of Brown University on arithmetic learning. Anderson reports that children learning arithmetic show us that humans seem to have two kinds of learning system. One is a system based on...

Werbos: … image-based thought which is really very optimal. It gets very optimal responses from kids responding to these kinds of relations. And then we have the symbolic reasoning which Jim says is like the alpha-test version of software. It’s clearly not fully developed, and he’s got lots of evidence for that. And that’s really what I was saying.

Now what is this next level, this semiotic level of intelligence? A lot of what Bill Calvin said really fits my image of this progression 100%, from the mammal level to the semiotic level. I used to call it symbolic, but then people reminded me the word ‘semiotic’ means you not only have symbolic reasoning but you pay attention to the meanings of your words. I have to admit that they are right. My belief is that there really are three or four stages in that evolutionary process. I learned from Bill a couple of months ago that there is one earlier stage that I forgot to include in my diagram that’s critical. I’ve got to read you these charts.

Levin: It’s all just temporal chunking.

Werbos: Well, temporal chunking occurs even in the mouse. But beyond that, there is a neat new experiment with the monkey. Well, I’ll tell you about this experiment.

(Bill told me about this experiment, described by Arbib and Rizzolatti.) Do you mind if I take the time to tell you about this? It’s a great experiment with a monkey. And I will give you a fictitious version of it that’s more fun than what really happened but is pretty close, OK?

Once there was a neuroscientist who was trying to understand the reaching behavior of a monkey going for a raisin. He had gotten lots of recordings from the brain of this monkey. They were the usual kinds of recordings -- at the beginning of the movement, one cell clicks, and at the end, another cell clicks. He has these cells connected to speakers so that he can hear these neurons of the brain. And there is this monkey. The monkey reaches for the raisin, and his final cell goes click, click, click and the scientist says “OK, I’m satisfied.” And he leaves the monkey sitting there in a chair. He sits at his table. He puts the monkey back in the cage and the monkey is still looking at him and he starts writing down in his notes “OK, I’ve got this result, I’ve got this result”, and then he reaches over because he’s feeling a little hungry and he reaches for a raisin. When he reaches for a raisin, suddenly the loudspeaker goes click, click, click, click, click, click. What is going on? The monkey is looking at him pick up the raisin, OK. So suddenly he discovers, “Oh, my God, this center of the brain is not just completion of the reaching movement. The monkey responds to me doing it the same way it responds to itself. It is encoding not only its own actions but my actions in the same brain center.” That’s kind of significant. That makes me think maybe the monkey is the first stage in evolution for it being semiotic. You get the mouse down here, then the monkey, then the early human stages.

The next slide gives my view of the underlying mathematics. This slide was written for the technical guys.

First, there is a classical kind of neural net adaptation where you have an experience, you’ve got current input, output, reward, and then you change the weights in your brain and then you forget that experience and you go on to the next experience. We call that classical, real-time, weight-based learning.
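(Illustrative sketch, not from the talk or the slide: the “experience, adjust weights, forget” loop can be written in a few lines of Python. The names `online_update` and `train_online` are hypothetical, the unit is a single linear neuron, and a supervised target stands in for the reward signal mentioned above. The update shown is an ordinary delta-rule (LMS-style) step, chosen only because it is the simplest case of real-time, weight-based learning.)

```python
def online_update(weights, inputs, target, learning_rate=0.1):
    """One delta-rule step on a single linear unit (an assumed, minimal model)."""
    prediction = sum(w * x for w, x in zip(weights, inputs))
    error = target - prediction
    # Nudge each weight in proportion to the error and to its own input.
    return [w + learning_rate * error * x for w, x in zip(weights, inputs)]


def train_online(stream, n_inputs, learning_rate=0.1):
    """Consume experiences one at a time; nothing is stored except the weights."""
    weights = [0.0] * n_inputs
    for inputs, target in stream:
        weights = online_update(weights, inputs, target, learning_rate)
        # The experience itself is now forgotten; only the weight change remains.
    return weights


# Hypothetical toy data generated by the rule: target = 2*x1 - 1*x2.
examples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
stream = (examples[i % 3] for i in range(600))
weights = train_online(stream, n_inputs=2)
```

With a small learning rate, a single experience moves the weights only a little, so many repetitions are needed before the weights settle near the generating rule.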

But in recent years, people have learned that that kind of learning has inherent limits. It creates trade-offs between generalization and learning speed. If you learn slowly that way, you can learn powerful ideas but you don’t have one-trial learning. (For example, you can’t learn to avoid a hot stove after one experience of being burned by one.) But the methods that have one-trial learning can’t generalize. But humans can generalize and do one-trial learning, both, so that classical weight-based learning is no good as a model. So recently I have started to sell people on an idea I published back in 1977. It takes a long time and most of the field still hasn’t caught up with this, but a few people have. It’s an idea I called ‘syncretism’ and now some people are calling it ‘memory-based learning’. It’s a very simple idea that as you learn you do one-trial learning to remember things, but then you generalize not just from present experience; you relive past experience. You generalize from past experience. And you think subjectively, of course. We adjust our perceptions of reality. We respond to new stimuli, but as we come up with theories to explain new stimuli, we go back to our memory to check those theories against what we experienced before. So we’re constantly consolidating memories, so to speak, by generalizing memory-based learning. That’s critical. So conceptually, we have a memory database and when we generalize, we look at present experience but also our memory database. We use our general model and our past associations and memories, both, when we decide what we actually expect to happen to us in life.
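(Illustrative sketch, not the 1977 formulation: one simple way to combine one-trial memory with slower generalization is to store every experience and keep replaying the stored experiences through a weight-based model. The class name `MemoryBasedLearner`, the `recall_radius` parameter and the linear model are all assumptions made for the example.)

```python
import math


class MemoryBasedLearner:
    """Toy learner: exact recall from memory plus a slowly trained general model."""

    def __init__(self, n_inputs, learning_rate=0.1, recall_radius=1e-9):
        self.weights = [0.0] * n_inputs
        self.memory = []  # the "memory database": list of (inputs, target)
        self.learning_rate = learning_rate
        self.recall_radius = recall_radius

    def _general_model(self, inputs):
        return sum(w * x for w, x in zip(self.weights, inputs))

    def _update(self, inputs, target):
        error = target - self._general_model(inputs)
        self.weights = [w + self.learning_rate * error * x
                        for w, x in zip(self.weights, inputs)]

    def experience(self, inputs, target, replay_passes=1):
        # One-trial learning: the experience is remembered immediately.
        self.memory.append((tuple(inputs), target))
        # Generalization: relive past experience by replaying the whole memory.
        for _ in range(replay_passes):
            for past_inputs, past_target in self.memory:
                self._update(past_inputs, past_target)

    def predict(self, inputs):
        # Use a stored memory when one (almost) exactly matches the situation...
        if self.memory:
            nearest = min(self.memory, key=lambda m: math.dist(m[0], inputs))
            if math.dist(nearest[0], inputs) <= self.recall_radius:
                return nearest[1]
        # ...otherwise fall back on the general model.
        return self._general_model(inputs)


stove = MemoryBasedLearner(n_inputs=2)
stove.experience([1.0, 0.0], -10.0)           # burned once
burn_expected = stove.predict([1.0, 0.0])     # recalled from memory

learner = MemoryBasedLearner(n_inputs=2)
learner.experience([1.0, 0.0], 2.0, replay_passes=50)
learner.experience([0.0, 1.0], -1.0, replay_passes=50)
unseen = learner.predict([1.0, 1.0])          # answered by the general model
```

In this toy version the stored memory answers for situations already met, while replay lets the general model extend to a combination of inputs that was never experienced directly.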

On this slide, I argue that there is a third, fundamentally different kind of learning that can be called ‘vicarious learning’, where we are training our neural nets against a memory database of our personal experience and of the reconstructed experience of other organisms. This is a fundamental principle. Now I understand there was some discussion of this kind of stuff in the old days by some of the Skinner people. There was some debate about – I forget the name that they used for this – but I would guess that that debate was very confused because these guys were still using intuitive psychology. We’re sitting here looking at mathematics. And in mathematics, it’s crystal clear that when you have this wiring diagram, you’re going to generate something different from what happens if you don’t. If you treat other organisms as just phenomena in nature and you process the information in an objective way, you are never going to have the wiring diagram that lets you learn from their experience the way you learn from your own experience. So my feeling is, my speculation is, that the use of vicarious experience is the transition that begins to move you from the base mammal level to the symbolic or semiotic level. And the way that works… where does vicarious experience begin? Well, it really begins with the monkey and I’m sorry I forgot that stage. (Note: one might speculate that the monkey reconstructs a first-person view of the actions of other monkeys which he sees; the children of hunter-gatherers would reconstruct experience they never saw themselves, conveyed by the dancing of their parents. Pribram tells me that monkeys enjoy video games like Mario Brothers, because they empathize with that other “monkey” on the screen, but rats never get interested. I do not know if anyone has tried the experiment on dogs or ferrets, etc.)
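(Illustrative sketch, under heavy assumptions: the “wiring diagram” difference can be mimicked by letting reconstructed experience of others enter the same memory database that the learner trains against. The function `reconstruct_first_person` is hypothetical and is just the identity here; in a real system that reconstruction would be the hard part. The linear model and toy data are likewise invented for the example.)

```python
def train_on_memory(memory, n_inputs, learning_rate=0.1, passes=200):
    """Train a single linear unit by repeatedly replaying a memory database."""
    weights = [0.0] * n_inputs
    for _ in range(passes):
        for inputs, target in memory:
            prediction = sum(w * x for w, x in zip(weights, inputs))
            error = target - prediction
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
    return weights


def reconstruct_first_person(observed):
    """Hypothetical step: re-express another organism's observed experience
    as if it had happened to the learner. Here it is simply the identity."""
    inputs, outcome = observed
    return (tuple(inputs), outcome)


# Toy data: the learner has only ever met situation A itself,
# but has watched another organism go through situation B.
own_experience = [((1.0, 0.0), 2.0)]
observed_from_others = [((0.0, 1.0), -1.0)]

# Without vicarious learning, observations of others never enter the database.
own_only = train_on_memory(own_experience, n_inputs=2)

# With vicarious learning, reconstructed experience is replayed like one's own.
vicarious_memory = own_experience + [
    reconstruct_first_person(obs) for obs in observed_from_others
]
with_vicarious = train_on_memory(vicarious_memory, n_inputs=2)
```

The learner trained only on its own experience has no expectation at all about situation B, while the one that replays reconstructed experience does, even though it never lived through B.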

But then after that, I remember some films of the Kalahari bushmen coming home from the hunt, OK. And you could see these guys dancing out the hunt. And you could see people in the tribe entranced, assimilating this, living this experience as if this was their own. They were in an hypnotic state. They were assimilating the experience of the hunt. And my guess is that the little children who experienced the hunt vicariously were probably pretty good hunters even on their first hunt because they learned something that they internalized. This internalizing is really important. So my speculation is that that phenomenon, the bushmen dance, is kind of where a lot of this stuff began. I don’t think it’s throwing rocks, but I admit, humans throw rocks too – that’s part of Bill’s story. (You should see what monkeys throw, and how accurate they can be sometimes.)

I suspect there’s a part of a gating system called NRT (or RTN) connected to the thalamus which may help to explain this, although some people tell me it might be special cells in the thalamus. There may be a very subtle gating mechanism that allows humans to be suggestible, to assimilate experience in ways other mammals don’t. Humans can be hypnotized and other mammals really can’t, not in the same way. But not all humans can be hypnotized, by the way. (See Estabrooks.) So it isn’t clear that this mechanism is operating equally in all humans. It’s clear that this mechanism has not reached equilibrium. And it’s clear that these new cellular mechanisms are not all-pervasive. And then after the dance, the next step is that you can get languages like Chinese or Eskimo that are basically word movies. After that, you get into propositional languages like our language.

There are some psychologists who theorize that the structure of our language is built into our genes and into the genes of all human beings equally. And there are a lot of Chinese people who really resent the idea that English grammar is hardwired into their genes. I recently heard a long talk from a prominent Chinese guy on language translation systems in Chinese. (JCIS98 Proceedings.) It is very clear that that language is not at all based on the kind of rules that some people think are hardwired in our genes. For every English sentence there may be 12 or so equivalent Chinese sentences. It’s like a many-to-one relationship. It’s a long story, but the bottom line is that our language is different from theirs, but I can’t believe that the English language is hardwired into our genes. How long has it been around, for God’s sake? The concept of a formal, logical proposition is the foundation of proper English – but that concept is no older than Plato and Aristotle! (Some English language authorities, like the Government Printing Office, appear to have retrogressed to the pre-Platonic level, insofar as they discourage the crucial use of the word “that” as a subordinating conjunction.) So that’s not in our genes. Maybe it could be in the genes of future humans, but it’s in nobody’s genes yet.

And then I have some notion that maybe honesty (i.e. realistic articulation between the symbolic and the subsymbolic) might be the next stage of evolution. Symbolic reasoning by itself is not a higher order of intelligence unless there is meaning to the symbolic reasoning. And if humans today had a really good correspondence between their subsymbolic thought and their symbolic thought, then maybe we might consider them a semiotic level of intelligence. But we don’t. Humans have not really reached a point where what we think subsymbolically and what we say line up in the reliable systematic way you would want for a true semiotic intelligence – if such a thing is possible. And it may not be. It may be that there isn’t a closed perfect design for semiotic intelligence. But certainly there is something better than where we are today. I mean, clearly we have not closed this evolutionary process, whatever it is. Whether we are moving ahead, I don’t know. But OK, I’ve used up my time. I’m sorry.

Velamoor: Does that something have to do with the simultaneity of the comprehension versus the sequential flow of the description? The ability to think semiotically?

Werbos: Yeah. I guess I feel the same way Jim Anderson does, which is that the genes are probably there to let us do better, but we just haven’t put them together yet. We’re just halfway through a process. We’re not done yet. We’ve just begun the process of being civilized.

Velamoor: What explains the resistance to change in terms of vicarious learning versus the reference base there is and the inability to immediately absorb the new vicarious experience? What’s the component that’s resistant to change that seems to be involved here? Again, in other words, my first reaction to something new that requires me to change is resistance. If I’m learning something vicariously, while I’m absorbing the vicarious learning there’s something in my makeup that resists it as a first impulse.

Werbos: Well, new experience can be difficult to adapt to whether it’s vicarious or even just your own personal experience. In other words, there are personal experiences you have that are hard to adapt to. Part of the problem is simply the fact that generalization takes a lot of effort. Other parts of the problem are more interesting and tricky. It is tricky because there are different personalities and different ways of dealing with a crisis. The psychiatrists have pretty good literature on that. They say that some people engage in denial -- defense mechanisms where they just reject the new information -- and I think that that’s an example of how humans are not yet a semiotic intelligence. A true semiotic intelligence would not simply reject information that way when forming verbal generalizations.

Calvin: The problem is that with new stuff you try to put it into old categories. And it’s your first attempt to try to make variance conform to something you said. Like when you hear a pronunciation of something that’s a bit off the standard, you conform it so that what you hear is the standard, unless it’s just too crazy – in which case you realize that there is something else going on and you should step back and try to make a new category. But this attempt to conform things into little categories is what makes the speed of our language possible. You have to be able to standardize between different speaking, with accents or such.

Whitney: I’ve got one question regarding learning from vicarious experience. Bees. What about honeybees?

Werbos: In fact, I was thinking about that. There are a lot of things I think about but don’t say because there is not enough time. But this guy Bitterman… I actually called him up two years ago to ask how things were going, you know, and how his recovery was from being a heretic. And he said “Well, you know, I actually don’t have total contempt for these invertebrate guys because there are two species of invertebrate you should pay attention to.”. And one was the honeybee and one was the octopus. And he said that honeybees are actually capable of doing things that only mammals seem to be able to do in the vertebrate kingdom, although there are other things they can’t do. It’s an interesting question. At a recent conference I asked a lot of people about these invertebrates, but they told me a sad story. They said that in the invertebrate brain there are basically two parts… or, rather, the arthropod brain. There are these things called ‘mushroom bodies’ where the cells are so small you can’t record from them, and then there is the other stuff that’s not where the higher intelligence is. So it may be harder to figure out intelligence in the honeybee than in the vertebrates. I don’t know.

Calvin: No, that’s the standard problem with these.

Werbos: Now the octopus is another story. And I bet a lot of those people haven’t done work with octopus in the last 10 years. I don’t know, but it really seemed to go out of fashion.

Whitney: Maybe the octopus had giant nerve endings.

Werbos: So that’s why it went out of fashion. And maybe the octopi were too smart to deal with. There are stories of people getting squirted by angry octopus and stuff like that. In fact, they all seem to have octopus stories.

Calvin: A lot of people consider them to be almost as smart as rats. It’s a really good example out of the invertebrates for that kind of intelligence.

Levin: One thing I want to pick up is something he said about being skeptical about the English language being hardwired. I don’t think there is anyone claiming that. There is no specificity for English grammar being hardwired. It is amazing, though, that when you have two different languages meet and children are brought up speaking in these strange amalgamations, these pidgins of English, say, definite forms emerge, definite patterns emerge. So it seems that there’s something hardwired.

Calvin: Actually, I just wrote a book about this. No, the Chomskian claim is that there is not the grammar of any specific language, but there is in effect a universal human language of which there are hundreds of variations and examples. The tendency is to create an ability to structure language – that is, to say phrases and clauses that you can nest together. And the tendency to learn to do that out of the examples that surround you – from whatever particular language you are surrounded by – comes from the universals. It’s not the specific English or Chinese grammar; these are just instances of a universal tendency. The Chomskian claim is that there must be something fairly specific and hardwired about it because there are an infinite number of grammars that you could make, but the human languages tend to fall into a few certain groups. The system is so organized that there must be some sort of standard.

Werbos: The Chomsky guys and the Skinner guys in the end are not totally monolithic, thank God, so one has to be careful about saying “Are the Chomsky guys right or wrong?”, because there are quite a lot of different Chomsky guys. But there really are people one runs across who are looking for the past participle gene. And it’s kind of sad. Sorry, I’ve seen people in the American Psychological Association…

Levin: Do you have their names?

Werbos: Naming names would be cruel and unusual punishment. But there are specific people doing that kind of stuff. So one has to be very careful to define what one means when one says there is something inborn. If you take the cautious position of saying that there is a deep structure underneath the symbols, which is a deep structure of meaning… Well, if you use that formulation I would say that yes, of course there is a deep structure, and I would say the mammal brain is that deep structure. So that this structure made up of decision blocks which are like verbs and objects which are like nouns… yeah, that’s the deep structure that’s in there. (Note: “decision blocks” are part of my view of the mammal brain, which I discussed in the part which was not recorded.) The mammal brain in a sense is the deep structure. And hierarchical decision structures create hierarchical procedures – sure, that’s there. On the other hand, if you start saying that all sentences must be grammatical and must be propositions, then I think you’re in trouble because there are languages which in a sense are not grammatical. They don’t follow the kind of formal Western notions of admissibility and well-formed text. So, it’s a question of how you formulate the Chomsky ideas. The bottom line is that you could be right or wrong and call yourself a Chomsky-ite; it depends on how you interpret the Chomsky ideas. Are you a fundamentalist Chomsky-ite? Or are you a revisionist Chomsky-ite? I’ve seen people of both types. Again, I would feel uncomfortable naming names because, you know, we have enough personality problems in this world without naming names.

Levin: Just to follow that up… I would be curious about a language where there would be no such thing as intuitively well-formed and not intuitively well-formed. And the business of propositions… You could argue that the propositional character of sentences is an artifact of translation, translating into English. In English there is subject-predicate. It may well be that we impose that by translational procedures, but I’d be very surprised to find a language that didn’t use propositions. And maybe that doesn’t mean anything, because it’s an artifact, but I’d be very surprised if it weren’t true.

Werbos: Well, one might even argue that we should all have the experience of what Chinese or Eskimo are really like. I sometimes even think we ought to teach pidgin English in school, because it’s easy to build a translation machine that goes in an automatic way from Chinese to pidgin English, but to do it to well-formed English is well nigh impossible. It should take about three months, I think, for students to start to know how to read pidgin English and understand what the implied structure of thought is. It’s hard for me right now – and maybe I’m just getting tired after talking for a long time – to convey what those languages feel like. But the notion of a word movie, I think, conveys what a lot of languages are like. A lot of them are like word movies, like sequences of images and each word is like an image. Of course there are sequences of images which are realistic and there are sequences of images which are crazy. That’s certainly true. But it’s not as if there were a grammar. I recently spoke to people of the Chinese Academy of Science, the high level Chinese government group, which was assigned to evaluate the possibility of translation between Chinese and English. What they told me, knowing Chinese much better than I do, was just flat out – this Western notion of grammar really is not applicable to the way we do things. Yes, you can make noise in Chinese that’s crazy and you can say things that sound coherent, but the distinction is not based on Western grammar. It’s something else.

Calvin: Now the universals and this sort of come back to intelligence itself. Universals tend to have to do with the common associations of words, and verbs are a good example. Any time you mention a verb, like ‘give’ or ‘bring’ or something, there are standard associations with certain classes. That is to say, either ‘bring’ or ‘give’, for example, has three nouns that have to go with it. Somebody speaks a sentence that says ‘Give him.’ and they don’t tell you what to give and you think that they have spoken an ungrammatical sentence. They have not fulfilled your expectations. And the search for things that will fit the appropriate category of object of ‘give’ and beneficiary or whatever… these are standard features. In Chinese, you know, the verb ‘give’ has the same set of expectations. Other verbs, say like ‘sleep’ can’t have objects in translation. And again, that’s the same thing in Chinese as in the other languages. They all have the system of common associations that allow you to guess what the speaker intended. You may not be able to do it perfectly, and the luxury of common speech consists of fragments of sentences. When you transcribe you discover this. Yet they communicate just fine because the recipient can guess what you mean.

Werbos: I agree that the phenomenon of meaning – that a sequence of words has a meaning – that phenomenon is universal. But how you relate the sequence of symbols to meaning… There are many different ways those things can go. Even in spoken language.

Calvin: Written language has to be much more specific than spoken language because you don’t have all the redundancies of voice and direct eye contact and all the things that characterize spoken language. So when you write something down you’ve got to dot the ‘i’s, cross the ‘t’s in ways that you don’t have to with language in general. Don’t confuse written language problems…

Werbos: Written Chinese certainly admits a level of ambiguity that you would not tolerate in an English class in the United States. You are allowed to say things in formal Chinese which an American English teacher would simply not accept. The Chinese would say that the American English teacher is a restrictive fascist for putting unnecessary boundaries on the human spirit. I’m sure the American English teacher would have another answer to that.

Calvin: Communication is always done in context. The context of Chinese is allowed to be broader.

Citron: Paul, I have a question. Based on your last 10 years of being involved in models of human learning, what do you think about the next 10 or 20 years of development in this field due to our technology development and the ability to do things that we couldn’t do in the past, that we can’t even do now but we may be able to do 5 years from now? How will that evolve over the next couple decades?

Werbos: It’s funny… John asked me the same question over breakfast. I tend not to look at it in a deterministic way, but in terms of opportunity. It clearly depends upon the level of affluence in a community and the amount of resources. My opinion is that with the correct level of focus and effort, in 10 years we could build the functional equivalent – at least in software – of an artificial base mammal brain. I think we could. But it would take a lot of effort. I think I now see the basic way of doing it. But there are something like 12 subtopical areas. We need subsystems to do X, and others to do Y, and others to do Z – that sort of thing. We know what the principles are for each of these 12 problems. We know what the relevant disciplines are. We know how to formulate the questions. It takes creative people but it doesn’t take Einsteins. With the right interdisciplinary teams working in parallel we could do it in 10 years. Now, even looking from the viewpoint of NSF which can fund some of these, I have to say it looks very chancy. The problem is that you’ve got to find people from the relevant disciplines to come together, cooperate, do something new, different from what they’ve been doing. And so we could do it in 10 years, but whether we will do it in 10 years is another matter.

Kistler: Would this be in actual hardware, in transistors, or would this be just the schematics?

Werbos: Well, what I said was that in 10 years it would be basically in software. I have proposed an initiative to do some of that work. And in parallel…

Calvin: A virtual bird, a virtual …

Werbos: …a program that runs on a supercomputer. But, on the other hand, the artificial mammal project, the artificial mouse... I even suggested to some of my management that we call it the artificial mammal or the artificial mouse, but, you know, engineers don’t tend to go for that sometimes – at least not American engineers. In Japan or China they would probably love that, but here they don’t go for that. Actually, a Russian friend of mine wanted to call it the artificial dog project, but... Well, the point is I don’t think we’re ready for that.

Citron: What about the long-range future of humanity?

Werbos: You are tempting me to tell an old story. Do you like old war stories? It’s an old war story. I was attending a meeting right here in Seattle. I think it was ’91 or ’90. There were a couple thousand people here. A guy named Bob Marks at the University of Washington had some involvement in this committee. And when we all showed up, the Boeing corporation said “Well, we’re going to take some of the top neural net people and invite them to our VIP dining room and pump their brains to learn what we can about neural nets and how we can use them on airplanes.” I was one of the folks invited over there. We were sitting in the Boeing dining room and one of the Boeing guys said, “You know, there are so many of these new technologies that pop up. Someone ought to do research to help us predict which ones are going to be flash-in-the-pan fads and which ones are going to be real.” and I said “Well, gee… NSF has funded research related to that. We’ve got an emerging technologies initiative. And I didn’t do that work, but I heard the story that one of the funny predictors was science fiction. If there was good science fiction about a technology that depicted it as helping humanity, the technology tended to flourish. And if the science fiction depicted it as hurting humanity, somehow the technology mysteriously withered on the vine. It is a very interesting predictor.”

I mentioned that, and then the guy from Boeing groaned, “Oh, my God.”. I asked “Why are you groaning?”. “Talk about bad science fiction. Have you seen Terminator 2?” And I said “No,” and he said “Well, you have to. After what you just said, you have to. It’s bad science fiction about neural nets.”. And then somebody said that there’s Data on Star Trek and it’s not all bad so there is still hope. He said “But you have to watch Terminator 2.”. OK, I decided I would.

In the meantime I talked to some people at the conference about how you use neural nets in SDI-type applications and I talked to a few other people and came back to NSF. Tuesday I signed a grant to set up some new company. And then I saw this movie Terminator 2 and it totally freaked me out because ... it starts out with the whole human race about to go extinct because some yoyo told the SDI people how to use really intelligent neural networks. There was a robot arm and a chip… The initial contract I had just funded was to develop a robot arm. There was a scientist I had recruited to work with them. He looked like the guy in the movie. OK. The vice president of the company, morphed with me, looked like the bad guy who’s coming to destroy Schwarzenegger. And then I thought “Well, geez, they couldn’t have known this. This only happened today.”, but then in the movie she’s getting precognitive visions of what’s going to happen in the future. (Note: to protect people’s privacy, I am deleting some of the more uncanny parallels from this transcript.) And so, for a day I was kind of shaky and I said “Let’s be real. We’re developing this mathematics, but to develop the hardware, the chip to compete with the human brain, I mean, that’s… What are we talking about? 10¹⁸? Where are we today? 10¹²? I shouldn’t worry.”. There was no chip that could do this and the company had a robot arm. So I don’t have to worry. OK.

Two days later, I received a phone call.... “I’m calling from .... We have developed a chip. We are working with ...” Oh, my God. “And we want to brief you on this.” So they brief me on it. They were doing the work for ... (Not SDI!) It was real. They felt they could get to ... I said “My God, who knows this?” and they said “Well, we haven’t told ... this because you know how it works when you try to get money ... You give them a nice target and if you meet it you get more money. Now we know we can reach ... but we don’t want to tell them that. We just plan to do it because that way we’re safe because we know we’re going to get our money...”. And so they went to the schematics (and I was really freaked) and I went to my friends ... and said “Those guys are really great; here’s another project you could have them work on that’s a different project, one that would be a little more useful in the near term. And I know this sounds like near-term thinking, but you know, you need to get immediate product out of these people. Please, why don’t we reallocate the effort because I’m not sure we need a Terminator 2. I mean, we can do plenty with what we already have to improve our ... Who needs something that can outsmart humans?”. And I thought that would be the end of it. And then... well... the story is not yet over, even today. We do have to be really careful here with these kinds of technologies.