Optimization: A Foundation for Understanding Consciousness



Paul J. Werbos[1]

8411 48th Avenue, College Park, Maryland, USA 20740





            This chapter describes how the concept of optimization — whatever its limitations — can be a useful tool in serious efforts to understand consciousness and the mind.  Such efforts must draw on what has been learned in many disciplines, many cultures, and many centuries.  Neural net designs based on optimization do add something new and critical here: they offer us a more complete understanding of the phenomenon of intelligence and mind, precise enough to be replicated on electronic computers, yet fully consistent with what we see in the brain and in experiments on overt behavior.  A deeper understanding of intelligence and the mind has immediate implications for the problem of consciousness, and for the foundations of psychology and philosophy.

            This chapter provides a global summary of these implications, as seen from the viewpoint of existentialism, Confucianism, and linguistic analysis — established philosophical traditions which should not be ignored here.  Among the six issues discussed are the subjective sense of existence, the levels of intelligence, the foundations of ethics, alternative states of consciousness, concepts of the soul, and the role of quantum theory.  In all cases, the chapter presents candid personal views which may be regarded as heresies by a significant fraction of the community.  The chapter argues that neural network research can indeed yield important insights into all of these questions, but that it does not provide a basis for overthrowing earlier views in philosophy or for resolving the debate about the existence of the soul; instead, it may help us to understand, unify, sharpen and deepen some very ancient insights. It suggests how one might understand and reinterpret some ancient four-letter words -- hope, fear and soul -- which have permeated human cultures for millennia, long before the advent of formal philosophy or theology.




            The title of this section is partly a pun, and partly an appeal to common sense.  The word "intelligent" by definition has something to do with the ability to do one's best, to optimize.  There is a huge literature out there -- both in economics and in other social sciences -- on humans' ability to foul up, to be irrational, to make mistakes, and to become totally insane; however, it is important to note that this literature mainly focuses on deviations from the default, reference assumption of perfect optimality.  The behavior it describes may be viewed as examples of stupidity (i.e., failures of intelligence) rather than examples of intelligence.  We as humans do not really intend to foul up (vis-à-vis our real values) or to waste energy fighting ourselves.  Optimality still provides a very powerful intellectual tool, which we can use to create very powerful designs and models, even if these models must be modified later to account for second-order phenomena.

            Years ago, for purposes of mathematical research, I proposed that we should actually define an intelligent system as a “generalized system which takes action to maximize some measure of success over the long-term in an uncertain environment which it must learn how to adapt to in an open-minded way.” (Werbos (1986) went on to define the terms within this definition.) This paper will try to describe the relation between this more precise concept of intelligence and the fuzzier concepts which have emerged simply by observing human beings. The key concept here is that an intelligent system does not start out with an optimal strategy of action; instead, it tries to learn an optimal strategy, bit by bit, over time.

            If you are an intelligent human being, and you can think of ways that other people around you could be a little bit smarter in achieving their goals, then the deviations from optimality may seem very important to you.  You are comparing one intelligent system (yourself) against another.  But if you are an engineer, trying to build systems which work as well as possible, you will find it truly amazing that human brains of any description perform as well as they do on such a wide variety of very difficult tasks.  If you do the very best you can, as an engineer, to develop an optimizing learning system, you will still find imperfections or limitations in what you develop.  In fact, it is fascinating how the imperfections of the best possible engineering designs do seem to match the most obvious imperfections of organic brains.

            As an example, consider the local minimum problem.  In realistic terms, it is not possible to build a powerful learning system which can never get caught in a rut, in a vicious cycle or local minimum.  Therefore, it should be no surprise at all that people and animals do get caught in ruts, even though their brains do have well-tuned mechanisms to try to minimize the problem.  Present-generation artificial neural networks (ANNs) do not get caught in local minima nearly as much as some people feared ten years ago; however, when ANNs are used to solve very complex control problems, it is crucial to use a strategy called "shaping" to avoid terrible local minima.  In "shaping" (White & Sofge, 1992), one first trains the ANN to learn a very simple version of the problem; one then uses the resulting weights as initial values for an ANN trained to solve a harder version of the problem; and so on.  This parallels the human need to learn "one step at a time."
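
            The warm-start idea behind shaping can be illustrated with a deliberately tiny toy problem.  The sketch below is not the procedure of White and Sofge (1992); it assumes a hypothetical one-parameter "network" with weight w, a made-up loss whose ripples (scaled by a difficulty k) create local minima, and plain gradient descent.  All names (loss, grad, descend, train_direct, train_shaped) and all constants are illustrative assumptions.

```python
import math


def loss(w, k):
    """Toy objective: a convex bowl centred at w = 4 plus ripples scaled by k.

    k = 0 is the 'very simple version' of the problem (no local minima);
    k = 1 is the hard version, with a local minimum near every integer.
    """
    return 0.1 * (w - 4.0) ** 2 + k * (1.0 - math.cos(2.0 * math.pi * w))


def grad(w, k):
    """Derivative of loss with respect to w."""
    return 0.2 * (w - 4.0) + k * 2.0 * math.pi * math.sin(2.0 * math.pi * w)


def descend(w, k, steps=5000, lr=0.01):
    """Plain gradient descent on the problem of difficulty k, starting at w."""
    for _ in range(steps):
        w -= lr * grad(w, k)
    return w


def train_direct(w0=0.0):
    """Attack the hard problem directly from the initial weight."""
    return descend(w0, k=1.0, steps=20000)


def train_shaped(w0=0.0, schedule=(0.0, 0.25, 0.5, 1.0)):
    """Shaping: solve an easy version first, then reuse the resulting
    weight as the initial value for each successively harder version."""
    w = w0
    for k in schedule:
        w = descend(w, k)
    return w


if __name__ == "__main__":
    for name, w in (("direct", train_direct()), ("shaped", train_shaped())):
        print(f"{name}: w = {w:.4f}, hard-problem loss = {loss(w, 1.0):.4f}")
```

            With the same total number of gradient steps, the direct run stays trapped in the ripple nearest its starting point, while the shaped run ends at the bottom of the bowl.  The example works only because the easy version's solution lies in the right basin of the hard version, which is the property a shaping schedule has to be designed to have.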

            Please note that learning one step at a time is not the same thing as performing a defined task one step at a time. A single step or stage in the learning process often represents an entire new strategy or concept of how to perform a complex task, a task which may not even be divisible into a sequence of subtasks. For example, in engineering, consider the problem of training a system to balance three connected poles, one on top of the other, like a family of acrobats trying to stand on top of each other without falling over. The first step may be to learn how to balance a simple pole. The second step might be to balance a large pole with a smaller pole on top. The third stage might be to balance two poles of the same size. Four stages of learning may or may not be good enough to solve this training problem. This step-by-step approach can work only if each individual step is easy enough to be learned, but hard enough to force the development of the new concepts (or “hidden variables” or “representation”) needed to solve the next, more difficult problem. In formal terms, one is always guaranteed to overcome the local minimum problem if one somehow can learn to develop the concepts necessary to the task at hand.

            Thanks to the learned use of symbolic reasoning, human beings do not get caught in a rut nearly as much as other species. (A human being who lived exactly like a chimpanzee would generally be considered as being caught in a rut.) We can understand this situation better by viewing it in a more positive way: humans often use symbolic reasoning to help them visualize new, creative opportunities to enhance their lives, in ways that would not be obvious if they always just followed the path of least resistance. Symbolic reasoning can help us learn and develop new concepts in a more systematic way, based on learned strategies of thought. Even so, the human use of symbolic reasoning has some serious limitations, to be described in section 5.  Even the most sane among us are still caught in local minima, in lives that fall short of our ultimate potential, to some degree (Campbell, 1971; Levine, 1994).

            Another example of alleged imperfection is the tendency of humans to seek novelty or new information, even at some cost in terms of reinforcement.  In actuality, novelty-seeking or exploratory behavior turns out to be an essential component of optimizing neural network control systems.  It is essential both for stability and for avoiding local minima (Miller, Sutton, & Werbos, 1990; White & Sofge, 1992).  In other words, it is essential to our ability to find ever more intelligent and more creative ways of coping with reality.  Dreams, heresies, humor and new challenges are all crucial aspects of human exploratory behavior.
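
            The cost of never exploring can be seen in a minimal two-choice example.  The sketch below uses epsilon-greedy action selection, a textbook device chosen here purely for illustration; it is not taken from the designs cited above.  The fixed payoffs in REWARDS, the function name run_agent, and the value epsilon = 0.1 are all hypothetical.

```python
import random

# Hypothetical fixed payoffs: action 1 is better, but the agent does not know it.
REWARDS = (1.0, 2.0)


def run_agent(epsilon, steps=1000, seed=0):
    """Choose between two actions, keeping a running-average estimate of each.

    With probability epsilon the agent explores (picks an action at random);
    otherwise it exploits the action whose current estimate is highest.
    Returns the total reward collected and the final estimates.
    """
    rng = random.Random(seed)
    estimates = [0.0, 0.0]
    counts = [0, 0]
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(REWARDS))
        else:
            # Ties are broken in favour of the lowest-numbered action.
            action = max(range(len(REWARDS)), key=lambda a: estimates[a])
        reward = REWARDS[action]
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total += reward
    return total, estimates


if __name__ == "__main__":
    print("no exploration:  ", run_agent(epsilon=0.0))
    print("some exploration:", run_agent(epsilon=0.1))
```

            The purely greedy agent tries the first action, finds it better than nothing, and never leaves it: a rut in miniature.  The exploring agent pays a small price on the steps where it experiments, but discovers the better action and ends up far ahead.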

            Certainly, the concept of optimization has been abused very often.  Many of us have gone through phases of excessive self-control during adolescence, in alternation with periods of excessive exploratory behavior.  In management research, it is now well known that "reinforcement" strategies which are based on demeaning, distorted assumptions about human values can reduce productivity substantially.  (There is an old adage that productivity is lowest in organizations where people are motivated by fear, mediocre in organizations ruled by greed, and highest in organizations driven by pride or self-respect.)  In large organizations of all kinds, managers who try to micro-optimize -- assuming that they know everything, and assuming that there is no need for exploratory behavior -- often degrade productivity.  All of these behaviors are motivated by an honest desire to optimize, but they are in fact grossly suboptimal; they are transitional stages, the kinds of stages or ruts which learning systems get stuck in for a while as they gradually learn better.  They may learn better either through creative thinking or through bankruptcy.  (Admittedly, however, the phenomenon of intelligence is far less obvious in social systems than in individual human minds.)

            In the field of psychology, Stephen Grossberg has argued very often that models based solely on reinforcement learning or optimization can only explain about half of the experiments out there.  To explain the other half, one must account for "classical conditioning," which requires a subsystem to generate expectations about the environment.  In fact, the more advanced ANN designs which I have developed (Santiago & Werbos, 1994; White & Sofge, 1992) do contain such an expectations system, because that is crucial to effective optimization in complex engineering applications. Prior to late 1993, there had been no serious, published tests of these particular designs on realistic control challenges, in part because there were simpler versions which were easier to implement; however, by late 1994, five groups of researchers had implemented these designs, and shown that they do lead to better results across a variety of applications -- difficult benchmark problems in bioreactor control, robot arm control, and automatic aircraft landing; simulated missile interception, compared against current state-of-the-art methods used on that problem; and control of a physical prototype of a hypersonic aircraft (Werbos, 1995a). In this paper, however, I will not review the mathematics of these models in detail, because they are moderately complex and have appeared elsewhere.

            A complete review of the literature on rationality and learning would require far more detail than I have provided here.  For example, it should consider Raiffa (1986), Von Neumann and Morgenstern (1953), Werbos (1968, 1992b), and the work of Herbert Simon and others.  The goal of this chapter, however, is not to evaluate the concept of optimality, but rather to use the concept in addressing larger questions about consciousness and the mind.


                             2. INTRODUCTION: THE ISSUE OF CONSCIOUSNESS


            In 1992, a prominent speaker at the annual conference of the European Neural Network Society declared "an open season on the problem of consciousness."  The "problem of consciousness" is a very old problem, and one may legitimately ask why we would suddenly spend so much energy in revisiting it at this time.  There are at least two legitimate answers to that question: (1) that fundamentally new insights, developed by the neural network community in interdisciplinary research, let us address the problem of consciousness at a higher level; (2) that a relaxation of certain academic taboos -- restricting analysis to overt behavior only (as in classical behaviorism) or to linguistic analysis only (as in some university philosophy departments in the US and UK) -- may now permit us to face up to issues which it was hard to address ten or twenty years ago.  These answers lead, however, to further questions: (1) If insights from neural network research are so useful, then why are so many of the new manifestoes on consciousness written by people with limited knowledge of the real frontiers of the field (i.e., of those aspects which are most relevant to higher intelligence)?  (2) Where is there serious philosophical depth in this discussion, above and beyond the classical Anglo-American approach?  (3) Just what is the problem of consciousness anyway?

            This chapter will draw heavily on current neural network research, as one might expect, but it will also draw on traditions like existentialism and Confucianism, which have critical contributions to make.  I do not have enough space here to explain all the vicissitudes and varieties of existentialism or Confucianism; however, these traditions are very important as an antidote to some of the more extreme and parochial approaches to philosophy which have existed in the past in some American universities.  Twenty years ago, the leading theory of ethics in the Anglo-American philosophy departments was a theory attributed to Rawls which proceeded entirely by performing a semantic analysis of the word "justice" and of what it should mean (based on assorted assumptions about what good definitions for a word should be), building up to strong recommendations for what policy makers should do all across the board. (Bear in mind that the problem of ethics refers to the problem of purpose and goals in human life; it requires a lot more than just coming up with a formula to keep lawyers happy.) This episode reminds me of a meeting I once attended at the Census Bureau, where famous world-class statisticians proposed to develop a measure of value or utility, for use in allocating federal funds, by simply doing a factor analysis of a complete set of available data series collected by the Bureau.  This situation would have been very amusing, except that billions of dollars of federal funds have actually been allocated on the basis of formulas derived in such ways.  See Werbos (1994a) for a discussion of assorted ways that value measurements have been developed in the government.

            Nevertheless, I would agree with the Anglo-American school on at least two basic points: (1) that it is foolish to invest too much energy in worrying about words like "consciousness" until we develop some sort of clear idea of what it is that we are worrying about, an idea of what the word is supposed to mean; (2) that language, in general, does play a deep and central role in philosophy (Werbos, 1992c).

            So what, then, is the "problem of consciousness"? This paper will not start out by picking out one particular definition of the word “consciousness”; this would be a misleading exercise, because the word really does have many different meanings. Instead, it will focus on six more specific questions that people appear to be asking under this general rubric:

            1.  How is it possible -- objectively -- that human beings could ever meet the dictionary definition of "consciousness" -- a basic sense of awareness, which allows them to respond to what they are aware of?

            2.  How is it possible that human beings have a subjective feeling that we do in fact exist, given that we have the various capabilities discussed under questions 1 and 3?

            3.  How is it possible that human beings show additional capabilities, such as intelligence or emotions or creativity, which we commonly tend to associate with our consciousness?

            4.  What is it in the brain that distinguishes between states of "consciousness" versus states of "unconsciousness" like sleep?

            5.  Can the human mind -- in its widest scope -- be explained entirely in terms of atoms and neurons, or do we need to invoke some sort of "soul" to explain the full range of our experience?

            6.  Can the human mind or the “soul” be fully explained in terms of algorithms or Turing-machine concepts (generalized to include continuous variables), or must we invoke other concepts like quantum computing (Penrose, 1989)?            

            This chapter will present my personal opinions on these questions.  The reader should be reassured that I am aware of the idiosyncratic nature of my views, and that my strategic goals in the neural network field (Werbos, 1993a, 1994b, 1994c, 1994d) are sufficiently explicit that they leave no room at all for me to entertain any kind of bias against anyone who can advance those goals, regardless of their views on these questions.  Because of page limits, this chapter will simply explain what my views are, and cite other papers which explain the critical details.


                                  3. THE OBJECTIVE QUESTION OF AWARENESS


            Question number one is hardly a problem at all, from an objective point of view -- even though it is probably the most semantically correct interpretation of the "problem of consciousness." Not only human beings, but all animals on earth show some degree of awareness of their environment.  Awareness -- in a literal, objective interpretation of the word -- simply refers to the ability of organisms to input and respond to data from the environment.  There is no great mystery in explaining why that phenomenon should evolve (i.e., can confer an advantage in survival), and no great mystery in seeing that there are neural circuits capable of providing that simple capability.

            Many of the neuroscientists working on "consciousness" would say that they are studying consciousness in the sense of awareness.  They study how people become "conscious of a stimulus." (For example, members of that community speaking at the 1994 World Congress on Neural Networks, whose works are cited in Alavi and Taylor, 1994, and Taylor, 1992, made this statement.) That research does not try to explain how awareness exists, in a general sense; rather, it attempts to uncover the specific mechanisms by which information attracts attention and is registered at various levels of the sensory system of the brain.  To fully understand these mechanisms — or to understand any other subsystem of the brain — it is crucial to understand how the subsystems contribute to the functioning of the whole system; consciousness in that sense is very much a subset of consciousness as intelligence, to be discussed in section 5.

            It is very unfortunate, in my view, that work on sensory input pathways — however important — has been mixed up with discussions of the existence of the soul, based solely on confusions between different definitions of the word "consciousness."  Leaping from sensory physiology directly to assertions about the soul is analogous to jumping from the physics of silicon to assertions about computer design, without bothering to learn about chip design or transistors (let alone applications) along the way.  In fact, the latter extrapolation makes more sense than the former, because silicon is at least a dominant aspect of chips, while sensory input is only one aspect of human intelligence.

            Another common fallacy in the neuroscience of consciousness is the search for the site of  “consciousness” within the cerebral cortex. This is analogous to the famous “search for the engram,” back in the days before we understood that human memory is more distributed -- even “holographic” -- in nature. Sensory inputs typically get registered at many sites, at many levels in the brain. Each of these sites represents a certain level of “awareness” -- a level of responsiveness to stimuli. Some biologists have been very excited to learn that human subjects state that they are aware of stimuli which reach certain sites, and state that they are unaware of certain others; however, from an objective point of view, this does not imply that one site is magically “conscious,” while others are not. It only tells us that information in one site is available as an input (direct or indirect) to those areas of the cortex which control the verbal behavior of asserting “I am aware of that stimulus.”

            It should be emphasized that neither of these fallacies is universal within the neuroscience of consciousness. However, there are many cases where these fallacies have received greater publicity than the valid, underlying science.


                                      4. THE SUBJECTIVE SENSE OF EXISTENCE


            From a very strict existentialist point of view, it is nonsense to try to "explain" our own subjective sense of existence.  Our subjective sense of existence or awareness is our starting point, the foundation on which we build everything else.  This question is analogous to a question which novices ask of physicists: "Dr. Einstein, can you explain why R=T in general relativity? What underlying phenomena give rise to that equation? What kind of ether do electromagnetic waves travel in?"  The point is that Einstein was looking for the lowest level of physical description, that level which inherently cannot be explained as the working out of something more fundamental.  Both Einstein and the existentialists were very active in questioning and revising their views of what exists at the most fundamental level, but they still maintained an effort to build everything else up from that level.

            From an objective point of view, we may twist the question around, and ask how it is that organisms could evolve a sense of their own existence as such.  Marvin Minsky answered this years ago, by simply pointing out that there are evolutionary advantages in organisms developing models of the self and insights to describe their own thinking.  Once again, there is no real problem here from an objective point of view.  When we ask whether other human beings have a sense of their own existence, we are essentially just asking the objective question; the answer is obviously "yes."  (It would still be "yes" even if other humans were actually just programs in a vast virtual reality game, so long as those programs demonstrated the pertinent objective capabilities.)  From an objective point of view, one may go further and argue that sane, self-aware organisms will naturally tend to accept the existentialist view of taking their own existence and awareness as a starting point, because this is an honest reflection of how their natural thought-processes work. (See section 5.)

            From a strict Anglo-American point of view, neither of these answers is entirely satisfactory, because they seem to assume that there really do exist organisms on earth, that there is such a thing as biological evolution, etc.  If we limit our thinking to nothing but the manipulation of words, without ever grounding ourselves in any sort of direct perception of reality, then we can in principle permit any fantastic combination of words to emerge from our mouths.  From such a viewpoint, we could just as well worry deeply about issues like why the sun appears to rise every day; after all, can we be really sure that the earth revolves about the sun?  Even if we accept that there is always some distant degree of uncertainty here (as is appropriate, from an existentialist point of view), it would seem silly to invest a huge amount of emotional energy on quirky little hypothetical contingencies which are poorly integrated into the rest of our concerns and which we have no way to account for in any case.

            I do not believe that all American philosophers adhere to the extreme viewpoint I am arguing against here; in fact, I will not spend any further time on that particular species of philosophy here.  Also, I do not mean to downplay the issue of how we know that the sun is likely to rise tomorrow; studying that issue is quite different from actually worrying about what to do (or how to answer intellectual questions) in case the sun actually does not rise tomorrow.  See section of White and Sofge (1992) for a discussion of how old questions, like the question of the sun rising tomorrow, do in fact get assimilated into more far-ranging theory in the neural network field. They do have a serious link to the hard-core scientific work to be summarized very briefly in the following section.




            In most of my research, I have found it preferable to address the issue of "intelligence," rather than the issue of "consciousness," because it expresses more exactly where the hard-core scientific issues really lie.  My view of intelligence is itself somewhat controversial, and some psychologists would argue that it is far too narrow; however, even my view requires us to include both emotions and creativity as attributes of intelligence. This is one case where neural net theory does indeed have something to say about conventional views of the mind: contrary to popular wisdom, as expressed in Star Trek etc., intelligent androids and the like cannot be devoid of emotional systems, because emotional systems are a necessary component of intelligent systems (Werbos, 1992a, 1992c). There are excellent reasons to expect this conclusion to apply even with fuzzier, less specialized views of “intelligence.”

            In my own research, I have defined an "intelligent system" as a system capable of maximizing some kind of measurement of utility or reinforcement or performance or goal-satisfaction (with or without prior knowledge of how that measure is defined as a function of other variables) over time, in an environment whose dynamics are not known in advance, so that the system must learn both the dynamics and a strategy of action in real time through experience.  It must be a generalized system, capable of adapting to "any" noisy, nonlinear environment, if given enough time to adapt. (See White & Sofge, 1992, chapter 10 for more precise concepts to replace the word "any.")  This definition implicitly includes the ability to solve complex problems which, in turn, implies some degree of creativity. Neural net designs now exist, on paper, which appear fully capable of meeting this definition (Werbos, 1992c; White & Sofge, 1992), though there are a few points where the approach is clear but the details have yet to be worked out (Werbos, 1993a, 1994d).  Some psychologists would complain that human beings are not totally rational or optimal; however, realistic neural net designs have imperfections which are similar in many ways to those of humans.

            Why are “emotions” necessary as part of such an intelligent system? The technical arguments are given in more detail in the sources cited above. Crudely speaking, any “intelligent” system -- by my definition or any other -- should at least have some ability to learn how to take actions at the present time which lead to better outcomes (by some criterion) in the future. It should have some degree of “foresight.” “Foresight” also turns out to be essential even to stability in conventional control systems like chemical plant controllers trying to maintain operation at a fixed set-point (Werbos 1995b). There are really only two ways to achieve “foresight” in the general case, where we can’t cheat by exploiting linearity or the like: (1) by building explicit plans for what we will do and what will happen, extending all the way into the distant future as far as we care about; (2) by developing an evaluation system, or “Critic,” which can be used to predict the long-term benefit of the various near-term alternative outcomes of alternative actions. (One can, of course, combine both planning and a Critic.) Whenever it is not possible to plan the future exactly -- because of uncertainties or variables beyond one’s control -- then an adaptive Critic becomes essential.
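
            The second route to foresight, an evaluation system, can be illustrated with a toy example.  The sketch below is a standard tabular dynamic-programming backup on a hypothetical five-state corridor; it is not one of the adaptive Critic designs cited above, and it assumes the dynamics are known, which those designs do not.  The names GOAL, GAMMA, ACTIONS, step, train_critic and greedy_action, the discount factor of 0.9, and the corridor itself are assumptions made only for illustration.

```python
GOAL = 4          # hypothetical corridor: states 0..4, utility only on reaching 4
GAMMA = 0.9       # assumed discount factor for future utility
ACTIONS = (-1, +1)  # step left or step right


def step(state, action):
    """Deterministic toy dynamics: move along the corridor, clipped at the ends.

    Returns the next state and the immediate utility of the move.
    """
    nxt = min(max(state + action, 0), GOAL)
    utility = 1.0 if nxt == GOAL else 0.0
    return nxt, utility


def action_value(J, state, action):
    """Immediate utility plus the Critic's discounted evaluation of the outcome."""
    nxt, utility = step(state, action)
    return utility + GAMMA * J[nxt]


def train_critic(sweeps=50):
    """Build a table J[s] estimating the long-term benefit of being in state s."""
    J = [0.0] * (GOAL + 1)
    for _ in range(sweeps):
        for s in range(GOAL):  # the goal is terminal, so J[GOAL] stays 0
            J[s] = max(action_value(J, s, a) for a in ACTIONS)
    return J


def greedy_action(J, state):
    """Pick the action whose near-term outcome the Critic rates highest."""
    return max(ACTIONS, key=lambda a: action_value(J, state, a))


if __name__ == "__main__":
    J = train_critic()
    print("Critic evaluations:", [round(v, 3) for v in J])
    print("Action chosen in state 0:", greedy_action(J, 0))
```

            In state 0 every available move has zero immediate utility, so a system looking only one step ahead has nothing to choose between.  The Critic's table rates the neighbouring state more highly than staying put, and that is enough to produce the farsighted choice without any explicit plan.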

            When there are many, many variables to be considered (as in human decision-making), then it is not enough to have one large evaluation system which produces a global evaluation of the entire state of everything in one’s environment. It is important to have individual evaluations, analogous to prices, for each of the important variables or objects in one’s environment. This idea -- the idea of calculating a positive or negative evaluation for each object -- corresponds exactly to Freud’s notion of “emotional charge.” (It also relates to the ancient idea of “hopes” or “fears” attached to individual objects or variables. Hope and fear refer specifically to the “emotional” reactions -- positive or negative weights -- placed on different variables, based on their implications for the future success of the organism. The words “good” and “bad” also express such assessments by the organism.) Backpropagation itself originated in 1974 as a surprisingly direct translation of Freud's concept of "emotional energy" or "psychic energy" into mathematics; those concepts are also the basis of the most powerful neurocontrol systems in engineering applications today (Werbos, 1994c, 1995a).  Grossberg (1982) has argued that an emotional system is needed even to replicate the simplest kinds of memory capabilities found in the human brain.  Levine and Leven (1992) have also discussed the importance of emotional systems at some length.
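
            One way to make the price analogy concrete is sketched below.  It rests on assumptions not stated in the text above: that the global evaluation is a differentiable function J of a state vector x = (x_1, ..., x_n), and that the evaluation attached to each variable can be read as the corresponding partial derivative.  The symbols J, x, lambda and delta are introduced here only for illustration.

```latex
\lambda_i \;=\; \frac{\partial J(x)}{\partial x_i},
\qquad i = 1, \dots, n,
\qquad\text{so that}\qquad
J(x + \delta) \;\approx\; J(x) + \sum_{i=1}^{n} \lambda_i \, \delta_i .
```

            Under this reading, a positive value of lambda_i marks a variable whose increase is expected to help the organism (a "hope"), a negative value marks one whose increase is expected to hurt (a "fear"), and the whole set of values for a given J is what a single backward pass of backpropagation computes.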

            Classical views of intelligence have often assumed that intelligence is either a binary variable (either you have it or you don't) or a continuous variable (everything from microbes to superhumans has a certain degree of it).  A careful examination of the real-time optimization designs now available (White & Sofge, 1992) suggests, instead, that intelligence is more like a quantized or discrete variable.  (Continuous variables like brain size and metabolic level also have some significance, contrary to what is politically correct; if they were irrelevant, evolution would have settled on a zero-cost zero-weight brain.) For example, even with simple supervised learning networks -- which probably exist as local circuits in the brain (Werbos, 1994b) -- there are fundamental, qualitative differences between different types of design: local designs based on fixed preprocessors, feedforward designs with adaptable hidden units, and simultaneous-recurrent networks adapted by simultaneous backpropagation. These different types of design yield distinct quantum levels of capability in approximating functions (Werbos, 1993a).

            At a more global level, Bitterman (1965) demonstrated years ago that there are basic, qualitative differences between intelligence in different classes of vertebrates, as seen in experiments on behavior.  He also showed that these differences have definite links to the qualitative differences in the gross cellular architecture between brains from different classes of vertebrates.  These differences, in turn, can be related to clear-cut differences which exist between different levels of design in artificial neural networks.  For example, the "error critic" design in White and Sofge (1992, chapter 13) requires something like a merger of limbic (critic) cortex and general (neuroidentification) cortex; such a merger does in fact underlie the historical evolution of neocortex in the mammal, and removal of the neocortex (according to Bitterman) eliminates processing capabilities which happen to be related to error critics.  To an engineer, it is astonishing that anyone would have simply assumed qualitatively equivalent behavior from well-designed systems with radically different components and structures; however, behaviorist dogma historically made it very difficult to study these basic realities.  (A cynic might argue that the behaviorists were trying to defend themselves against the charge that experiments with animals might not tell us directly about humans.  Another explanation is that behaviorists were trying to save the world from the dangers of racism -- including racism against snails and microbes.) The requirement for an emotional system applies, however, even to the simplest level of intelligence within vertebrates; all vertebrate brains do possess a limbic system.

            What would it take to achieve a quantum level of intelligence which can truly adapt to "any" environment, up to the full potential of the universal Turing machine? In Werbos (1992b) and White and Sofge (1992, chapter 13), I argued that full Turing machine capabilities require the use of explicit symbolic reasoning.  The naive next step is to conclude that human beings -- who seem capable of symbolic reasoning by use of words or mathematics -- represent a quantum step in the evolution of intelligence, above other mammals.  From the viewpoint of everyday experience, this would seem highly probable, at first.

            On the other hand, formal symbolic reasoning is a surprisingly recent phenomenon. It is easy enough for humans to utter words, but the conscious manipulation of words or equations by use of formal symbolic logic and related techniques is relatively new. In fact, the articulation of experience into formal logical propositions or equations is also new. Without such articulation, symbolic reasoning as such has little value. Of equal importance are those forms of “visualization” which translate back from formal symbols into presymbolic “images.” The general development of symbolic reasoning over the past few millennia has been charted in some detail by Sapir (in comparative linguistics) and by Max Weber (in comparative sociology). For ideological reasons, Max Weber has become quite popular in recent years, and Sapir has not, but the history they summarize remains quite serious.

            In the neural network field, Jim Anderson (e.g., Anderson, Spoehr, & Bennett, 1994) has done extensive modeling and analysis of how humans learn arithmetic. Based on his empirical findings, he has argued that humans possess "two" learning mechanisms: (1) a highly developed and fine-tuned "sensory" system, shared with other mammals; (2) a "buggy alpha test version" of formal symbolic reasoning.  After all, if symbolic reasoning is the foundation of human technology and civilization, how do we explain the fact that human technology and civilization are only a few thousand years old? The obvious answer (elaborated on in Werbos, 1992c) is that humans represent a recent, unstable transitional life-form, which has only recently evolved just enough capability for symbolic reasoning to let it muddle through a few technological design problems, on a one-in-a-million basis (which is still enough to start a technological civilization, when there is a culture available to disseminate new ideas, as has been observed even in chimpanzees).  We ourselves are the "missing link" between the mammalian and the symbolic levels of intelligence.  Perhaps there will never be such a thing as a fully perfected symbolic reasoner, but it is clear that humans have not exhausted whatever potential does exist.  These ideas may be seen as an explanation for related observations by Lorenz, as discussed by Levine in this book.

            One might then pose the problem of consciousness as follows: Are human beings really "conscious" or "intelligent"?  Perhaps not, in the larger scheme of things.

            In Werbos (1992c), I explain how simple wiring changes, related to the balance between the waking state and the dreaming state, might be central to human abilities in symbolic reasoning.  (These, in turn, might be related to the unique wiring of the human thalamic reticular nucleus as discussed by John Taylor (Alavi & Taylor, 1994; Taylor, 1992).)  If so, there is little doubt that such capabilities could be wired into a computer as well.  Computers could be made "conscious" or "intelligent" at a level beyond that of human brains today, if we were crazy and suicidal enough to want to do this.

            In my view, the biggest single symptom of our lack of evolution is our inability to master the most fundamental aspects of symbolic reasoning: the ability to accurately articulate our true goals and values, in a way which is totally in harmony with the presymbolic aspects of our thought, and allows us to master symbols instead of being mastered by them.  In crude language, the problem is that we lie to ourselves.  (In psychiatrists' terms, we overuse denial as a defense mechanism.)  We lack the ability to simply articulate -- in a direct, honest way -- the information coming to us from all of our feelings and our everyday experience of life.  My examples of Anglo-American philosophers and statisticians, in Section 2, are not isolated examples.  To perform reasoning effectively, humans must learn even the most basic things the hard way, like dogs learning to walk on two feet.  It is natural for humans to learn symbolic reasoning, when they have enough time and help and intelligence, but the process can be very difficult.  The basic foundation of Confucian ethics -- to learn to know oneself, and to be "true" to oneself -- may be viewed as a remarkably clear expression of (and aid to) that learning process.  In this view, the mark of a sane human being is an attitude towards life which includes a kind of total openness to the empirical data which comes to us from our senses and from our emotionally-charged feelings, and an easy two-way communication and harmony between the symbolic and nonsymbolic aspects of our intelligence.  This is very close, of course, to the Freudian ideal of "sanity."

            From a more formalistic point of view, Confucian ethics may be justified as follows.  As Bertrand Russell pointed out long ago, there can be no logical, operational answer to questions like "What should we do with our lives?" because the word "should" does not have any operational, objective content.  However, there can be an operational answer to the question: "What would I do if I were wise? What 'answers' to the problems of ethics would satisfy me -- put me in a state of stable mental equilibrium in respect to my acceptance of these 'answers' -- if I fully understood myself, my feelings, and my environment?" These questions are inherently meaningful and operational because they address the I, the self, which can be understood -- in part because of neural network research (Werbos, 1992c).  Using these questions as the foundations of ethics leads one directly to the pursuit of "integrity," as defined by Confucius.

As a practical matter, one can never expect to achieve a complete and perfect understanding of one’s environment and oneself, any more than one can expect to play a perfect game of chess; however, this does not invalidate the effort.

            This section should not be interpreted as an endorsement of all the secondary ideas which have evolved in Confucianism over the years.  Confucianism — like Christianity, Marxism, Islam, Buddhism, and Western science — has accumulated its share of obnoxious barnacles, due to the universal existence of power-seekers, opportunists masquerading as zealots, gullible followers, and groupthink.




            There is a radical difference between the concept of consciousness as "wakefulness" and the concept of consciousness as "intelligence."

            Neural network theory already provides some insight into the reasons why intelligent organisms must have multiple states of consciousness.  For example, in Werbos (1987), and White and Sofge (1992), I argue that some form of "dreaming" or "simulation" is essential to the efficient adaptation (or effective foresight) of advanced reinforcement learning systems.  After Sutton and I had long discussions of that paper (cited by Sutton) at GTE in 1987, Sutton actually performed simulations (described in Miller, Sutton, and Werbos, 1990) demonstrating this point empirically.  This interpretation of dreaming is basically equivalent to the theory developed independently by LaBerge (see LaBerge & Rheingold, 1990), who is arguably the leading dream researcher in the world today.
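
            The role of "dreaming" in adaptation can be illustrated with a toy example. The sketch below is not the simulation cited above; it is a minimal tabular illustration, assuming a hypothetical six-state chain with a reward at the far end, in which a learner replays remembered transitions ("dreams") after each real step. The names train and known_entries, the learning parameters, and the choice to replay every remembered transition in a fixed sweep (rather than sampling them) are all assumptions made for illustration.

```python
import random

ACTIONS = (-1, +1)  # move left or right along the chain


def train(sweeps, episodes, n_states=6, seed=0):
    """Tabular Q-learning on a chain world.

    `sweeps` is the number of replay passes over all remembered
    transitions ("dreaming") performed after every real step.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    alpha, gamma, epsilon = 0.5, 0.9, 0.1
    q = {(s, a): 0.0 for s in range(n_states) for a in ACTIONS}
    model = {}  # (state, action) -> (reward, next_state), from real experience

    def env_step(s, a):
        # Deterministic chain: reward 1.0 only on reaching the goal state.
        s2 = min(max(s + a, 0), goal)
        return (1.0 if s2 == goal else 0.0), s2

    def update(s, a, r, s2):
        # One-step Q-learning update; the goal state is terminal.
        future = 0.0 if s2 == goal else max(q[(s2, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (r + gamma * future - q[(s, a)])

    def choose(s):
        # Epsilon-greedy choice with random tie-breaking.
        if rng.random() < epsilon:
            return rng.choice(ACTIONS)
        best = max(q[(s, b)] for b in ACTIONS)
        return rng.choice([b for b in ACTIONS if q[(s, b)] == best])

    for _ in range(episodes):
        s = 0
        while s != goal:
            a = choose(s)
            r, s2 = env_step(s, a)
            update(s, a, r, s2)        # learn from the real step
            model[(s, a)] = (r, s2)    # remember what happened
            for _ in range(sweeps):    # "dream": replay remembered steps
                for (ps, pa), (pr, ps2) in model.items():
                    update(ps, pa, pr, ps2)
            s = s2
    return q


def known_entries(q):
    """Count state-action pairs that have acquired a positive value."""
    return sum(1 for v in q.values() if v > 0.0)
```

            In this toy setting, a learner without replay assigns value only to the single step that reached the reward during its first episode, whereas the replaying learner propagates that value backward along remembered steps without needing further real experience.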

            As noted in the previous section, I have also suggested how an intermediate stage of consciousness, linked to hypnosis (Werbos, 1992c), may be important to human abilities with language.  Deep sleep (and its sub-states?) remains a mystery, but there are new possibilities for linking that phenomenon to neural network research (Werbos, 1993a).  More research is needed here, especially to pin down the link between neural net models and brain circuits, but there is good reason to expect success in this work, if sufficient effort is applied.


                                                    7. WHAT ABOUT THE SOUL?


            Up to this point, I might hope that any truly rational scientist, reviewing the evidence carefully, would at least respect the views I have expressed.  From this point on, I have no such illusions.

            Sections 5 and 6 argued that everything that people associate most passionately with human consciousness -- intelligence, emotions, creativity, dreams, and so on -- can be fully understood in terms of classical neural network models, consistent with the Turing theory of computation. Werbos (1994b) gives an overview of how these new models fit with specific circuits in the brain as well.  By Occam's Razor, this suggests that the hypothesis of a "soul" is totally unnecessary and should be abandoned.  This is clearly a highly rational conclusion to draw, and I remember believing in this conclusion very intensely back at ages 8 through 19.  However, on a purely personal basis, I have come around to the view that something like a "soul" -- a part of the mind and the self which cannot be reduced to atoms and neurons -- is in fact necessary in order to explain the full range of human experience. Like Shaw(), I am concerned with dimensions of experience more subtle than those which are usually cited in these discussions, and my use of the word “soul” is not intended in any way as a reference to theology (as will be discussed).

            Based on past experience, I would predict that most readers will feel a fair amount of surprise at seeing the last two sentences in print.  A good number of readers -- including some very creative and prominent people -- will quietly voice agreement, but will wonder where we go from here.  A few canny old psychiatrists may even snigger: "So someone else has discovered that you need Jung as well as Freud to come to terms with the full spectrum of human experience.  So what else is new?"  A few psychologists will immediately leave the room, for fear that the physicists will denounce them as practitioners of voodoo and steal all their federal funding if they are seen consorting with people who express such views.  (These fears are not entirely based on fantasy, either.) A very few readers will actually feel honest, subjective uncertainty about the issue, and really seek evidence for and against.  (That was my stance in 1969-71, the period when I really first developed Backpropagation, ADAC and other backpropagation-based critic designs, though I only published Werbos, 1968, then.)  A fair number of very articulate readers -- including many powerful administrators -- will instantly think about two questions: (1) Has an eccentric lunatic just walked into the room? Is this another Eccles (1993)? (2) If we make room for the discussion of the soul hypothesis on an equal footing with the "standard" alternative, do we risk losing the insights we get from neural net research and unleashing forces of sheer craziness and illogical thinking which could overwhelm us?

            There is no way that a chapter this brief could seriously resolve the concerns of all these groups.  However, I would like to make some comments regarding the last two concerns.

            Back in 1964, when I first read Hebb's ideas about these issues, I found myself in complete agreement with his views.  Hebb was trying to explain the idea of Occam's Razor, which we now understand more precisely (White & Sofge, 1992, chapter 10).  He described how prior expectations -- which encourage us not to invoke "expensive" assumptions which complicate our underlying understanding of the universe -- are important in science, above and beyond empirical data as such.  As an example, he pointed towards the laboratory work in parapsychology.  He argued that most scientists would probably agree with the conclusions of that work, if they judged the statistics as they do with most scientific papers they read.  However, because those conclusions have a huge improbability "cost" a priori, we would still tend to disbelieve them, if we take a balanced look at prior and empirical information.  Based on section 5, I would take this a step further: I would argue, even now, that all of the laboratory data we now have regarding human abilities, from problem-solving through to parapsychology, is still not convincing enough to justify the soul hypothesis.
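
            The balance between prior and empirical information can be restated in the language of Bayesian inference. The following is only an illustrative sketch, not Hebb's own formulation; the symbols H (a surprising hypothesis) and D (the laboratory data), and the specific numbers used afterwards, are assumptions chosen purely for illustration. In odds form, Bayes' rule reads:

```latex
\[
\frac{P(H \mid D)}{P(\neg H \mid D)}
  = \frac{P(D \mid H)}{P(D \mid \neg H)} \times \frac{P(H)}{P(\neg H)}
\]
```

            If, hypothetically, the data favored H by a factor of 100 while the prior odds were one in a million, the posterior odds would be 100 x 10^-6 = 10^-4. Disbelief would then remain the balanced conclusion, even though the statistics taken alone look persuasive.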

            In fairness to the parapsychologists, I should confess that I do not know their literature well enough to draw strong conclusions.  There is an analogy here between parapsychology and the study of ancient history: it requires reliance on a huge body of secondary sources, many of them quite willing to stretch the truth in favor of diverse biases (some in favor and some against), so that it would take a huge effort to make a truly judicious analysis.  Even if one did all that work, one should recall the example of Aristotle, who produced a wonderfully judicious resolution of the scientific issues of the time; judicious or not, it was dead wrong.  Thus even if the results from parapsychology were very clear-cut, the average scientist could not afford to know enough to find a compelling reason to believe them.

            Given this situation, how could I — or any other scientist, thinking for himself or herself — give any credence at all to the soul hypothesis?

            In my own case, the answer lies in direct, personal observation of what I see around me. I do not expect all rational scientists to agree with me, because they do not share the same base of experience.  But I do not accept the idea that I myself, in formulating my own views, must discard any personal experience which has not been socialized through the laboratory.  I like to believe that my interest in the human mind, and my acceptance of the existentialist/Confucian viewpoint back in 1964, was the real cause of my making these observations -- which I did not allow myself to accept for several years.

            Just how strange and eccentric is it to be open to the soul hypothesis based on personal experience? Years ago, the National Science Foundation commissioned a major study of the underlying values of the American people, through the National Opinion Research Center (NORC) at the University of Chicago, a leading center of excellence in surveys and sociology and the like.  One of the difficult issues they addressed was the nature of beliefs and experience related to the soul hypothesis.  They discovered that personal experiences played a far greater role than they had expected beforehand.  Even more surprising, they found that the percentage of people claiming such experience increased monotonically with education and other measures of success.  The investigators have reported (Greeley & McCready, 1975) the great surprise they encountered when they presented this finding to their review board.  A skeptic on the board pointed out that their statistical results would predict that 70% of that very board (composed of PhDs) would have answered "yes" to a highly inflammatory-looking question.  After this, 70% of the board did in fact come forward, reluctantly, and validate the prediction -- to the great surprise of everyone in the room.  My own views of the soul hypothesis and the relevant experience are considerably more complex and idiosyncratic than what was reported in Greeley and McCready (1975), but the bottom line is still this: whether I am a lunatic or not, I am certainly not a very eccentric one (except perhaps in my willingness to articulate taboo ideas, when my session chair asks me to address a controversial issue).  There are many serious, technical people who take the soul hypothesis seriously, and they merit equal time on this issue.

            Would these statistics be different for people who -- in addition to being well-trained -- are highly independent, creative thinkers, the kind of people who have demonstrated more than anyone else their ability to ignore conventional wisdom (of both parents and peers) and arrive at their own viewpoint? It is interesting to go back and consider the four greatest physicists of this century, the four pioneers who rebuilt the very foundations of modern physics -- Einstein, Schrodinger, Heisenberg and DeBroglie.  Einstein often used the word "God," and is often alleged to have been a mystic; however, in what I have seen of his writings, I see no reason to believe that this was anything more than the erudite but firmly "secular" theology I have seen very often, expressed in similar ways, at the local Unitarian church.  On the other hand, records of the conversations between Schrodinger and Einstein make it very clear that Schrodinger was deeply interested in things like Sufi mysticism -- something which is far more than mere allegory.  Heisenberg consistently described his physics in Vedantic terms, and invited well-known yogis to expound their views at the Copenhagen Institute.  DeBroglie is said to have been a follower of Bergson's vision of collective intelligence, which would appear to be a close relative of Teilhard de Chardin's views.  All in all, the 70% figure would seem to be in the ballpark here.

            Would the soul hypothesis per se undermine the effort to understand the mind in a scientific way? On the contrary, one might argue that efforts to totally repress this idea (or to hand it over as a monopoly to television preachers) would be as conducive to sanity as any other kind of gross repression of thought.

            The greatest abuse of the soul hypothesis has come from power seekers who try to use it as an excuse for making other people follow their orders in a blind, unthinking manner, without opening themselves to personal experience, to mathematical or scientific efforts to understand that experience, and so on.  The formulation I am proposing here would still start out from the Confucian/existentialist point of view; that view clearly argues that we should try to be true to our entire self -- including both the brain and the soul.  If neural network mathematics is useful in understanding the general phenomenon of intelligence -- regardless of the hardware that implements this intelligence -- then it should, in principle, be useful even in explaining other forms of intelligence.  So far as I can tell, in my own experience, this does appear to be the case.  The Appendix to this chapter will describe some of my personal thoughts on this point, for those who take the hypothesis seriously. Section 8 will explain why I use this ancient four-letter word “soul,” despite the unfortunate associations it conjures up in the minds of some readers.


                                    8. QUANTUM COMPUTING, MIND AND SOUL


            Quantum computing is a serious and exciting new area for research. However, like the neuroscience of consciousness, it has spawned massive confusion, both in the public and in the scientific community, in part because it combines two complex research areas -- quantum field theory (QFT) and advanced computing. Even within the scientific community, there are relatively few people who truly understand the basics of both of these areas.

            This section will argue that there is a serious, realistic possibility that quantum computing might produce generic, useful computational capabilities, and that related capabilities might even exist in the “soul” (if the soul exists) but probably not in the brain. However, it will suggest that these capabilities could only become intelligible after we reorient this research in new directions. Before explaining these points, I must first review some basic facts which are well understood already by the relevant specialists.

            Some people imagine that a valid understanding of computation in the brain must make reference to quantum theory because, after all, electrons and protons and so on are governed by quantum theory. But one could apply the same logic to computer chips as well; they too are made of electrons, protons and neutrons. In actuality, quantum theory is used routinely and extensively by the people who design fundamental electronic devices like transistors and gates; the literature on electronics is already quite full of concepts like quantum wells, tunneling junctions, band gaps, Aharonov-Bohm rings, and so on. But all of this is at the device level. One uses quantum theory, for example, to design a device which performs a task like the logical “AND” operation. Then, when combining low-level devices together to make a useful computer system, one relies mainly on classical, digital logic or (as in artificial neural networks) on simple analog concepts which are also quite classical. Penrose (1989) does a reasonably accurate job of describing the kind of logic that we use when we build up systems from devices. Our new designs in the neural network field have many advantages in terms of cost and throughput, but they still fit into this general framework.

            In formal terms, all of the computer systems in use today -- from personal computers through to biologically-inspired holographic systems -- can be understood as “Turing machines.” They fit into a universal theory of computing systems developed decades ago by Alan Turing.

            Quantum computing is a novel effort to design computer systems which exploit fundamental effects in QFT which cannot be reduced to Turing machines. Early work in this field was inspired by suggestions from Richard Feynman, one of the co-inventors of QFT. An excellent survey has been published by David Deutsch (1992) of Oxford University, one of the leading researchers in this area. Deutsch has developed a new universal theory of computing, analogous to Turing’s, but expanded to incorporate quantum effects. Deutsch and other workers in this field have indeed demonstrated that quantum effects can be used to perform tasks which cannot be performed nearly as well by Turing machines. Nevertheless, the tasks described so far appear more like curiosities, rather than the basis of any truly generic technology. Deutsch expresses serious doubt whether any of this will ever have practical significance to any form of generic computing technology; however, he hopes that it is too early to tell. This literature provides no basis at present for believing that quantum effects are important in any way to the phenomenon of intelligence.

            Within the fields of psychology and neural networks, many researchers have suggested that field effects or even three-dimensional Schrodinger equations could be important to intelligent systems (Pribram 1991, Werbos 1993b). But the computational mechanisms proposed in that literature are not examples of quantum computing as defined above. They are fully within the range of what can be simulated (albeit inefficiently) on conventional digital computers. They are fully within the range of what can be implemented efficiently in the kind of hardware used for artificial neural networks.

            Hameroff and his collaborators (Pribram 1994) have recently proposed that coherence effects like those used in lasers might produce true quantum computing effects within the microtubules of cells. There are excellent computational reasons to predict that the microtubules do play a crucial role in “intelligence” in the brain (Werbos, 1992a, 1994b); however, this does not require quantum computing effects. For Hameroff’s coherence effects to work, Penrose has calculated that they would somehow have to involve correlations across 10,000 neurons or more. There is no indication of what new computational capabilities such a correlation would lead to, and no indication whatsoever that such effects would have anything to do with what we see happening at that level in the brain. It is not entirely obvious that laser-like activity could be possible in assemblies of neurons.

            All of these negative conclusions and loose ends appear very discouraging at first. However, they are really quite typical of any research field in its early stages. The neural network field went through a similar period of discouragement, between the publication of Minsky’s book on perceptrons and the work which led to the popularization of backpropagation (Werbos 1994c). Fifteen years ago, the most serious, well-informed analysis of fuel cells in transportation appeared quite negative; however, new approaches and breakthroughs have made this the lead candidate for the automobile of the future, and the subject of a major joint initiative between the President of the US and the automotive industry. There is a legitimate basis for hoping that new approaches might work as well in the field of quantum computing.

            Conventional approaches to quantum computing are inspired mainly by the Copenhagen or the many-worlds interpretations of QFT, and by conventional digital, sequential computing. But there are other interpretations of QFT in existence. Regardless of which interpretation is actually true, in an objective sense, they are all close enough that they give some valid intuition about the phenomena themselves.  One interpretation which I have developed (Werbos, 1994e) is the idea that quantum effects can be explained by assuming that causality runs forwards and backwards, symmetrically, in quantum experiments.  Thus, when people use special crystals to demonstrate basic quantum effects, there is a kind of settling down through a resonance between past and future -- like a Hopfield net or a simultaneous-recurrent net (Werbos, 1992a), but without the need to wait for convergence through iteration in forwards time.  Even if the human brain has no such capabilities, I can imagine a possibility (with 20% probability?) that this could be used to increase the power of optical neural networks.  It is questionable that humanity would benefit much from such technology, but the intellectual issue is worth resolving.
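
            For readers unfamiliar with the classical side of this analogy, the sketch below shows what "settling down through iteration in forwards time" looks like in a small Hopfield-style network. It illustrates only the ordinary, classical relaxation process, not any quantum or time-symmetric effect. The function names train_hopfield and settle, the Hebbian outer-product weights, and the eight-unit pattern are assumptions chosen for illustration.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns (Hebbian rule)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j]
    return w


def settle(w, state, max_sweeps=20):
    """Update units one at a time until a full sweep changes nothing.

    Returns the settled state and the number of sweeps performed,
    including the final sweep that confirmed nothing changed.
    """
    state = list(state)
    n = len(state)
    for sweep in range(1, max_sweeps + 1):
        changed = False
        for i in range(n):
            field = sum(w[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            return state, sweep
    return state, max_sweeps


# Usage: store one pattern, corrupt two units, and let the net settle.
stored = [1, -1, 1, -1, 1, -1, 1, -1]
noisy = list(stored)
noisy[0] = -noisy[0]
noisy[5] = -noisy[5]
recovered, sweeps_used = settle(train_hopfield([stored]), noisy)
```

            The classical network reaches its stable state only after repeated update sweeps; that waiting is the step the time-symmetric picture described above would avoid.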

            Because Penrose has generated some strong visceral reactions amongst physicists, I need to make a few side comments here, for the physicist, before continuing.  In my alternative interpretation of quantum theory, I am not hypothesizing that "quantum causality" (as Schwinger would define it) is violated; rather, I am merely highlighting the well-known fact that ordinary time-forwards causality -- causality as defined in the original Bell-Shimony work -- is violated by standard Quantum Electrodynamics (QED).  (In my papers, for example, I cite well-known work by Von Neumann and DeBeauregard on this point.)  I am not assuming any deviations from QED in this argument.  My alternative interpretation is relevant here only as a way of getting intuition about QED.  Likewise, I am not talking about a kind of computing which would require astronomical energies; ordinary Bell's Theorem experiments have been conducted at very ordinary levels of energy, using the same kinds of photorefractive crystals that people use in optical implementations of ANNs. As this book goes to press, both Elizabeth Behrman of Wichita State University and John Caulfield of Alabama A&M University have claimed serious progress in developing ideas and designs of this sort, involving realistic optical computing hardware.

            One reviewer -- a non-physicist -- has asked for a simple example of backwards causality in quantum physics.  The simplest example I know was discussed in my 1974 paper on quantum foundations (cited in Werbos, 1993c), based on the account of nuclear exchange reactions in Segre's book Nuclei and Particles.  Suppose that you could design a cannon which, without any electronic control system, had the following capability: whenever an enemy rocket is about to come up over the horizon, it will automatically swivel into exactly the right angle, and fire at the exact time, so that it will hit the target exactly when the target first appears over the horizon, even if the target is fired after the cannon must fire to meet it.  If anyone ever built such a cannon, one might attribute it to magic or precognition, or suspect over-the-horizon radar and cheating.  But neutrons, shooting pi mesons out to oncoming protons, have displayed exactly such a "precognition." The conversion of the oncoming proton to a neutron proves that charged mesons are exchanged.  More relevant, but complicated, examples (involving optics and Bell's Theorem) are cited in Werbos (1993c).  Behavior like this may sound mysterious, but it is fully consistent with the model of a universe governed entirely by partial differential equations.

            Taking this further, some of my friends have suggested that quantum effects and holographic processing could possibly explain the aspects of experience which I attribute to “soul.”  As an example, one of these friends has cited the work on remote viewing of H.E. Puthoff and Russell Targ at SRI International in the 1980’s, funded by the Department of Defense. Unfortunately, I do not have easy access to that work, and I do not have strong feelings about its validity. However, the concept of remote viewing does exemplify the kind of phenomenon which -- if true -- would present an interesting challenge to physics and psychology. It is easier to discuss than the more complex phenomena which I find more interesting.

            Quantum effects and holographic effects by themselves could not begin to explain something like remote viewing. The kinds of mechanisms which we observe in the brain -- the mechanisms which drive the creation of chemical bonds, the flux of electromagnetic fields, and the movement of currents -- are based entirely on quantum electrodynamics (QED), an aspect of QFT which is well understood in phenomenological terms. QED fully incorporates quantum effects, and it underlies all forms of holography now known to the human species. It is not a deep, dark mystery. If quantum and holographic effects were enough to give us a capability to see a picture of a remote location far away, based on a receiving device as small as a human brain on the surface of the earth, then the scientists in the military -- who are very familiar with QED -- would have built such a device long ago. The military have spent billions of dollars, across many research labs and universities, trying to improve the resolution of their imaging of distant objects, using devices much larger than a brain, exploiting all kinds of interference effects at all kinds of frequencies in the electromagnetic spectrum. On occasion, highly creative physicists like Schwinger and Hagelstein have demonstrated that coherence effects can accomplish things which more pedestrian experimentalists had thought impossible; however, these things fall far short of remote viewing à la Puthoff and Targ.

            Based on this work, we may be reasonably sure that “remote viewing” would require one or more of: (1) a highly complex signal processing system and “antenna”; (2) some kind of explicit cabling system or network to connect remote sites; (3) additional physical fields beyond those covered by QED. Even in biological signal processing systems, such as sonar processing in the bat, it is clear that a large and visible chunk of the brain is necessary in order to perform signal processing for something much less complex than remote viewing. All of this suggests that we really need to face up to a stark, binary decision here: either to reject the proposed class of phenomena altogether, or to consider the possibility of information processing structure (like invisible networks or invisible signal processing or “intelligence” in the universe itself) beyond what we can see in the atoms of the brain. It is rational to feel uncertain (i.e., to assign probabilities) between these two alternatives, but it is not rational to imagine that one can avoid the choice itself through some kind of fuzzy logic. As noted in section 7, the “soul” alternative would have a high a priori improbability cost; however, it need not be a whole lot worse than the assumption of unseen “dark matter” amongst astronomers, if one considers the amazing variety of biological systems on earth adapted to exploit diverse sources of energy. Still, as discussed in section 7, there are good reasons to respect those scientists who consider the improbability cost too high to consider, based on their present experience.

            The argument above does not suggest that quantum effects, holography or complex vibrational states in large molecules are unimportant to biological intelligence. It merely suggests that they would not be enough by themselves to explain phenomena like remote viewing. It reinforces the conclusion from earlier paragraphs that there is little if any indication of true quantum computing in the brain itself even if we should postulate effects like remote viewing. However, once we postulate such effects, we can begin to imagine the possibility of yet another level of intelligence, beyond the level of single-stream symbolic reasoning, based on effects such as time-symmetric causality or the processing of multiple streams of symbols in parallel. Such possibilities are extremely speculative, of course, at the present time.




            The editor of this book has asked me to say something more specific about my views on the nature of the soul, and its relation to other themes in this book.  This request is eminently reasonable; however, my thoughts on this point should not be considered part of the chapter proper, because they are inextricably linked to idiosyncratic aspects of personal observations and experience.  In the absence of shared experience and lengthier, more complete explanations, I would not expect a rational reader to agree with the details of my views.  I would ask the classical materialist simply to skip this appendix; it is, at best, a "what if" piece, asking what we might conclude after we agree that the soul does exist.

            My own base of experience is perhaps a bit closer to what I read about in Jung (see Campbell, 1971) than to the kind of experience described by Greeley and McCready (1975), though I can relate to both to some degree.  Greeley and McCready state that the experiences they refer to are not limited to any religious or ethnic group, but that most educated people tend to become much more involved in their own particular religious heritage and more committed to its beliefs after undergoing such experience.  I find this somewhat disappointing, and perhaps a further bit of evidence that we are still very much a transitional species.  To the extent that there is some common experience out there, logic suggests that it should push us towards more common conclusions, rather than push us into greater provincialism and sectarianism.  It is one thing to appreciate the living culture and past experience of one's provincial heritage; it is a totally different thing to endorse florid theories, of bureaucratic rather than empirical origin (like the Government Printing Office Style Manual or lists of prison sentences in purgatory), without paying full attention to the global heritage of humanity as a whole.

            After one accepts that the soul exists, one's prior probabilities (per Hebb's argument) change substantially.  One naturally does try to learn from the experience of others, as well as oneself.  In anthropology, the example of penicillin is very famous: penicillin (in bread mold) was used in healing for many, many years by African witch doctors, but totally ignored by scientists because they did not like the explanations used by the witch doctors; knowing about that example, we may try to learn what we can from the experience of many cultures, without letting ourselves be put off by our disrespect for their explanations of their experience.  Of course, we must be careful to account for what we know about the predictable ways in which rumor and wishful thinking tend to distort experience (especially when they tend to deify people in power).

            After having explored more of these cultures and people than I could summarize here, I feel increasingly confident that there is no one on earth who has a legitimate basis for describing the nature of the soul in any real detail.  The exploration has been worthwhile for other reasons, and there are important insights to be found in obscure cultures, but none of these people even begins to approach the level of qualitative understanding we would want to demand, as scientists.  In understanding the soul, we are like people in the tenth century, interested in astronomy; there is some important information available to us, but if we demand a full understanding in our lifetimes, we will only set ourselves up to become the victims of other people's fantasies.  A rational, honest, intelligent human being would have to take the kind of approach described by Raiffa (1968) in decision analysis: i.e., to accept uncertainty as an unavoidable fact of life, and to live with it as best we can.  Like the tenth century astronomers, we may still choose to work hard to grow in understanding, but to do this effectively we must admit the limitations we face.  We need to play these issues by ear, to maintain a certain degree of balance and detachment, to rely heavily on direct observation (which we constantly try to enhance further), and to maintain a variety of alternative working hypotheses.

            In examining historical ideas about the soul, I am amazed at the florid details of religious mythologies which contradict each other and are rather easy to explain away in psychoanalytic terms as creations of the mind (Campbell, 1971).  On the other hand, it is hard at times to avoid some degree of respect for the extreme Buddhist viewpoint that everything we see can be explained away as a creation of the mind, including the walls and the floor; however, such feelings can be explained away as a consequence of our present ignorance, and are comparable to the pessimism of certain neuroscientists regarding our understanding of the brain (Werbos, 1994c, p. 2).  Still, the existence of an alternative explanation does not disprove the concept.

            As a humorous aside, I can imagine someone arguing that everything we see is a product of Mind, and that Mind in turn is governed by backpropagation — ergo that backpropagation is the foundation of everything.  Even as the inventor of backpropagation, however, I would find that idea a bit too much.

            If we find that florid mythologies are not satisfying (and are too large a set to select from in any case), then — in our effort to do better than chaotic, pure phenomenology — our best hope is to use some of the same ideas we use in science, including Occam's Razor.  In fact, even the mystics have used expressions like "As above, so below," and expounded the idea of monism — the idea that the soul and the body are governed by the same set of natural laws, laws which are no less precise and universal for being unknown in the present age of ignorance.  Even the New Testament is full of references to things that can only be "revealed" or understood in a future age when humanity is ready, as a result of learning over time.  Is it not possible that mathematics is a crucial part of what is necessary for such understanding, and part of what we have really been learning in the past two millennia?

            From this perspective, then, can we imagine how a universe governed by some kinds of mathematical laws that we can conceive of -- either from differential equation theory or information processing theory -- could generate such a phenomenon as "soul"? As someone who knows about information processing more than I know about differential equations, I still find it hard to imagine information processing as a foundation to explain everything.  The problem is that all forms of Mind that we are familiar with (and can conceive of) inherently require something outside themselves to relate to (Jung in Campbell, 1971).  Finkelstein (1985) and others have looked for reformulations of quantum theory in terms of Mind based on "quantum neural networks"; however, it is my understanding that such efforts have not gotten very far. If we cannot yet conceive of a universe governed by information processing concepts, then we are left with the alternative of partial differential equations, an approach which has been studied at length by physicists such as Einstein.

            Any differential-equation-based Cosmos would presumably be governed by some sort of thermodynamic principles, like those we experience here which generate Darwinian selection, or a generalization to account for causality forwards and backwards in time.  (See Werbos, 1994e, for a discussion of the complex relations between these various concepts.)  Until recently, my thoughts were based more on the former.

            In a Darwinian Cosmos, one might think of the soul as a kind of living organism, based on fields and forces as yet not understood, living in symbiosis with the other part of us.  (I am reminded of the Star Trek episode where Dr. Crusher points towards a "ghost" and says something like: "You... you are not really a spirit...  I now know what you really are, you dirty cheater... you are nothing but a life form."  But this "ghost" was not the only one of us guilty of being a life form.  It's better than being a death form, I suppose.)  The traditional alchemical marriage (Campbell, 1971) can be seen simply as an effort to get both parts working in harmony, in a unified way, in recognition of the fact that this is the only way to get a Pareto optimal result for both parts.  When storing information, however, one would normally prefer to store it in more permanent hardware.  (Some mystical traditions have argued that all humans routinely exercise capabilities far beyond what they consciously believe in -- but that people have difficulties in putting enough learning or experience into their souls to permit the easy memory or control of such faculties.  I am reminded of Hebb's (1949) comment that more brain space and learning time are needed when learning to cope with larger volumes of sensory input.)  The quality of symbiosis might depend both on actions initiated on the soul side and on the normal genetically-determined capabilities of the nervous system; but who knows?

            Based on these ideas, there are two kinds of symbiosis one might imagine -- a one-to-one symbiosis, or a many-to-one symbiosis.  The latter would match a wide variety of traditional mystical beliefs, ranging from Jung's collective unconscious through to Teilhard de Chardin (1972) or the Gaia hypothesis (Lovelock, 1992).  If we postulated such a collective intelligence or soul, then I would predict that our experience of the soul would be analogous to the experience of a single neuron (or cell assembly) inside of a higher-order neural network; for example, we may be whipsawed by backpropagation effects at times, or we may find ourselves acting as powerful channels of psychic energy (backpropagation), especially when we crystallize concepts which can help the entire global system to escape from local minima, and to grow in maturity.  In either model of symbiosis -- one-to-one or many-to-one -- I would expect that issues related to psychological growth and ego formation, as described by Freud and clarified by neural network models, would apply in a similar way both to the soul and to the brain.

            More recently, I find myself influenced by images which emerge from Werbos (1994e), which come closer to the older ideas of a much larger web of life, in which people may vary in their degree of immersion in the more local collective intelligence.

            There has been a lot of interest lately in the Gaia hypothesis (Lovelock, 1992), which has been used, for example, as a rationale for environmentalism of the spirit (Gore, 1992). (There have been many interesting treatments of this idea in science fiction as well, including -- among others -- some of the works of Orson Scott Card, Silverberg and Chalker.) All of this fits well with my own thoughts, but lately I feel there is something fundamentally incomplete in that image.  Recently, I find myself more attracted to the old Chinese image, which pictures humanity more as a middle kingdom, poised between earth and sky — demanding a balance between these two strong spiritual connections or parts of our lives.

            Some readers may feel that I have left out some very crucial things in this very brief account.  I agree, very strongly.  A few of the holes are filled in (albeit still very briefly) in Werbos (1986, 1992c, 1993b, 1994e).

            As a practical matter, I do not spend a lot of time thinking about these concepts, however great their putative importance, because I recognize how great our ignorance really is; however, there is no doubt that they substantially color my perception of human events, and I like to believe that they do at least represent some improvement over the traditional extremes of florid, fearful ethnocentric mythologies and cold, grey, blind materialism, both of which substantially inhibit the natural human tendency towards spiritual growth.




Alavi, F., & Taylor, J. G. (1994).  Computer simulation of conscious sensory experiences.  World Congress on Neural Networks, San Diego (Vol. 1, pp. 209-214).  Hillsdale, NJ: Lawrence Erlbaum Associates.

Anderson, J. A., Spoehr, K. T., & Bennett, D. J. (1994).  A study in numerical perversity: Teaching arithmetic to a neural network.  In D. S. Levine & M. Aparicio, IV (Eds.), Neural Networks for Knowledge Representation and Inference (pp. 311-335).  Hillsdale, NJ: Lawrence Erlbaum Associates.

Bitterman, M. E. (1965).  Comparative analysis of learning, Science, 188, 699-709, 1975.  See also The evolution of intelligence.  Scientific American, January 1965.

Campbell, J. (Ed.) (1971).  The Portable Jung.  New York: Viking.

Deutsch, D. (1992).  Quantum computing.  Physics World, June 1992.

Eccles, J. C. (1993).  Evolution of complexity of the brain with the emergence of consciousness.  In K. H. Pribram (Ed.), Rethinking Neural Networks: Quantum Fields and Biological Data.  Hillsdale, NJ: Lawrence Erlbaum Associates.

Finkelstein, D. (1985).  Superconducting causal nets.  International Journal of Theoretical Physics, Vol. 27, No. 4, 1985.

Gore, A. (1992).  Earth in the Balance: Ecology and the Human Spirit.  Boston: Houghton-Mifflin.

Greeley, A. M., & McCready, W. C. (1975).  Are we a nation of mystics?  New York Times Magazine, Jan. 26, 1975.

Grossberg, S. (1982).  A psychophysiological theory of reinforcement, drive, motivation, and attention.  Journal of Theoretical Neurobiology, 1, 286-369.

Hebb, D. O. (1949).  The Organization of Behavior.  New York: Wiley.

LaBerge, S., & Rheingold, H. (1990).  Exploring the World of Lucid Dreaming.  Ballantine.

Levine, D. S. (1994).  Steps toward a neural theory of self-actualization.  World Congress on Neural Networks, San Diego (Vol. I, pp. 215-220).  Hillsdale, NJ: Lawrence Erlbaum Associates.

Levine, D. S., & Leven, S. J. (Eds.) (1992).  Motivation, Emotion, and Goal Direction in Neural Networks.  Hillsdale, NJ: Lawrence Erlbaum Associates.

Lovelock, J. E. (1992).  The Gaia hypothesis.  In L. Margulis and L. Olendzenski (Eds.), Environmental Evolution.  Cambridge, MA: MIT Press.

Miller, W., Sutton, R., & Werbos, P. J. (Eds.) (1990).  Neural Networks for Control.  Cambridge, MA: MIT Press.

Penrose, R. (1989).  The Emperor's New Mind: Concerning Computers, Minds and the Laws of Physics.   Oxford, UK: Oxford University Press.

Pribram, K. H. (1991).  Brain and Perception.  Hillsdale, NJ: Lawrence Erlbaum Associates.

Pribram, K. H. (Ed.) (1994).  Origins..., Proceedings of the Second Appalachian Conference, papers on coherent photons.  Hillsdale, NJ: Lawrence Erlbaum Associates.

Raiffa, H. (1968).  Decision Analysis: Introductory Lectures on Making Choices Under Uncertainty.  Reading, MA: Addison-Wesley.

Santiago, R., & Werbos, P. J. (1994).  New progress toward truly brain-like intelligent control.  World Congress on Neural Networks, San Diego (Vol. 1, pp. 27-33).  Hillsdale, NJ: Lawrence Erlbaum Associates.

Shaw, B. (1965).  Back to Methuselah.  In The Complete Plays of Bernard Shaw.  London: Paul Hamlyn.

Taylor, J. G. (1992).  From single neuron to cognition.  In I. Aleksander & J. Taylor (Eds.), Artificial Neural Networks 2 (ICANN92 Proceedings).  Amsterdam: North Holland.

Teilhard de Chardin, P. (1972).  Activation of Energy.  New York: Harcourt-Brace.

Von Neumann, J., & Morgenstern, O. (1953).  The Theory of Games and Economic Behavior.  Princeton, NJ: Princeton University Press.

Werbos, P. J. (1968).  Elements of intelligence.  Cybernetics (Namur), No. 3.

Werbos, P. J. (1986).  Generalized information requirements of intelligent decision-making systems, SUGI 11 Proceedings, SAS Institute, Cary, NC, 1986.  A version updated in later 1986 is available from the author.

Werbos, P. J. (1987).  Building and understanding adaptive systems: a statistical/numerical approach to factory automation and brain research.  IEEE Transactions on Systems, Man, and Cybernetics, Jan/Feb, 1987.

Werbos, P. J. (1992a). The cytoskeleton: Why it may be crucial to human learning and to neurocontrol.  Nanobiology, Vol. 1, No. 1.

Werbos, P. J. (1992b).  Neurocontrol: where it is going and why it is crucial, in I. Aleksander and J. Taylor (Eds.), Artificial Neural Networks II, North Holland, 1992.  Updated versions are reprinted in Werbos (1994c) and in M. Gupta (Ed.), IEEE Handbook of Intelligent Control, forthcoming.

Werbos, P. J. (1992c).  Neural networks and the human mind: new mathematics fits ancient insights.  IJCNN92-Beijing Proceedings, IEEE, New York, 1992.  An updated version appears in Werbos (1994c).

Werbos, P. J. (1993a).  Supervised learning: Can it escape from its local minimum?  WCNN93 Proceedings.  Hillsdale, NJ: Lawrence Erlbaum Associates.

Werbos, P. J. (1993b).  Quantum theory and neural systems: alternative approaches and a new design, in K. H. Pribram (Ed.), Rethinking Neural Networks: Quantum Fields and Biological Data.  Hillsdale, NJ: Lawrence Erlbaum Associates.

Werbos, P. J. (1993c).  Quantum theory, computing and chaotic solitons.  IEICE Transactions on Fundamentals, Vol. E76-A, No. 5, May 1993; Chaotic solitons and the foundations of physics: a potential revolution.  Applied Mathematics and Computation, 56, No. 2/3, July 1993.

Werbos, P. J.  (1994a).  Rational approaches to identifying policy objectives.  Energy, Vol. 15, No.3/4, 1990.  Reprinted in J. Weyant and T. Kuczmowski (Eds.), Systems Modeling Handbook, Pergamon, 1994.

Werbos, P. J. (1994b).  The brain as a neurocontroller: New hypotheses and new experimental possibilities.  In K. H. Pribram (Ed.), Origins..., Proceedings of the Second Appalachian Conference.  Hillsdale, NJ: Lawrence Erlbaum Associates.

Werbos, P. J. (1994c).  The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting.  New York: John Wiley and Sons.

Werbos, P. J. (1994d).  How we cut prediction errors in half by using a different training method.  World Congress on Neural Networks, San Diego.  Hillsdale, NJ: Lawrence Erlbaum Associates.  WCNN94 Proceedings.

Werbos, P. J. (1994e).  Self-organization: reexamining the basics, and an alternative to the big bang.  In K. H. Pribram (Ed.), Origins: Brain and Self Organization.  Hillsdale, NJ: Lawrence Erlbaum Associates.  For more mathematical treatments focusing solely on the issue of quantum foundations as such, see Werbos (1993c).

Werbos, P. J. (1995a).  Neural networks for flight control: a strategic and scientific assessment.  In M. Padgett (Ed., Auburn U.), 1994 Workshop on Neural Networks, Fuzzy Systems, Evolutionary Systems and Virtual Reality.  Bellingham, Washington: Society of Photo-Optical Instrumentation Engineers (SPIE).  ISBN No. 1-5655-044-7.

Werbos, P. J. (1995b).  Control.  In E. Fiesler and R. Beale (Eds.), Handbook of Neural Computation.  New York: Oxford U. Press.

White, D., & Sofge, D. (1992).  Handbook of Intelligent Control: Neural, Fuzzy and Adaptive Approaches.  New York: Van Nostrand.

[1] The views herein are purely my personal views, oversimplified in places to make a point.  They certainly do not in any way represent the views of any of my employers past and present, one of whom remains a close friend and supporter even though he is totally aghast at Section 7 and the Appendix.