Only a catastrophe that wipes out the entire process.

SUCH AS AN ALL‐OUT NUCLEAR WAR?

Thatʹs one scenario, but in the next century, we will encounter a plethora of other ʺfailure modes.ʺ Weʹll talk about this in later chapters.

I CANʹT WAIT. NOW TELL ME THIS: WHAT DOES THE LAW OF ACCELERATING RETURNS HAVE TO DO WITH THE TWENTY‐FIRST CENTURY?

Exponential trends are immensely powerful but deceptive. They linger for eons with very little effect. But once they reach the ʺknee of the curve,ʺ they explode with unrelenting fury. With regard to computer technology and its impact on human society, that knee is approaching with the new millennium. Now I have a question for you.

SHOOT.

Just who are you anyway?

WHY, IʹM THE READER.

Of course. Well, itʹs good to have you contributing to the book while thereʹs still time to do something about it.

GLAD TO. NOW, YOU NEVER DID GIVE THE ENDING TO THE EMPEROR STORY. SO DOES THE EMPEROR LOSE HIS EMPIRE, OR DOES THE INVENTOR LOSE HIS HEAD?

I have two endings, so I just canʹt say.

MAYBE THEY REACH A COMPROMISE SOLUTION. THE INVENTOR MIGHT BE HAPPY TO SETTLE FOR, SAY, JUST ONE PROVINCE OF CHINA.

Yes, that would be a good result. And maybe an even better parable for the twenty‐first century.

‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

C H A P T E R T W O

THE INTELLIGENCE OF EVOLUTION

Hereʹs another critical question for understanding the twenty‐first century: Can an intelligence create another intelligence more intelligent than itself?

Letʹs first consider the intelligent process that created us: evolution.

Evolution is a master programmer. It has been prolific, designing millions of species of breathtaking diversity and ingenuity. And thatʹs just here on Earth. The software programs have all been written down, recorded as digital data in the chemical structure of an ingenious molecule called deoxyribonucleic acid, or DNA. DNA was first described by J. D. Watson and F. H. C. Crick in 1953 as a double helix consisting of a twisting pair of strands of polynucleotides, with two bits of information encoded at each rung of a spiral staircase by the choice of nucleotides. [1] This master ʺread onlyʺ memory controls the vast machinery of life.

Supported by a twisting sugar‐phosphate backbone, the DNA molecule consists of between several dozen and several million rungs, each of which is coded with one nucleotide letter drawn from a four‐letter alphabet of base pairs (adenine‐thymine, thymine‐adenine, cytosine‐guanine, and guanine‐cytosine). Human DNA is a long molecule—it would measure up to six feet in length if stretched out—but it is packed into an elaborate coil only 1/2500 of an inch across.

The mechanism that peels off copies of the DNA code consists of other special machines: organic molecules called enzymes, which split each base pair and then assemble two identical DNA molecules by rematching the broken base pairs. Other little chemical machines then verify the validity of the copy by checking the integrity of the base‐pair matches. The error rate of these chemical information‐processing transactions is about one error in a billion base‐pair replications. Further redundancy and error‐correction codes are built into the data itself, so meaningful mistakes are rare. Some mistakes do get through, most of which cause defects in a single cell. Mistakes in an early fetal cell may cause birth defects in the newborn organism. Once in a long while such a defect offers an advantage, and the new encoding may eventually be favored through the enhanced survival of that organism and its offspring.
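The digital character of this code is easy to make concrete. Here is a minimal Python sketch, purely illustrative (the pairing rules are from the text above; the two-bit assignment and the function names are my own inventions), of a strand, its enzyme-built complement, and the kind of integrity check the copying machinery performs:

```python
# Illustrative sketch (not a model of the actual enzymatic machinery):
# DNA as a two-bit-per-rung digital code with complementary base pairing.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
BITS = {"A": 0b00, "T": 0b01, "C": 0b10, "G": 0b11}  # four letters, two bits each

def complement_strand(strand: str) -> str:
    """Build the matching strand, as the copying enzymes do by rematching pairs."""
    return "".join(COMPLEMENT[base] for base in strand)

def verify_copy(original: str, copy: str) -> bool:
    """Integrity check: every rung of the copy must pair with the original."""
    return len(original) == len(copy) and all(
        COMPLEMENT[a] == b for a, b in zip(original, copy)
    )

strand = "ATCGGCTA"
copy = complement_strand(strand)
assert verify_copy(strand, copy)
print(f"{strand} stores {2 * len(strand)} bits")  # 8 rungs -> 16 bits
```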

The DNA code controls the salient details of the construction of every cell in the organism, including the shapes and processes of the cell, and of the organs composed of those cells. In a process called translation, other enzymes translate the coded DNA information by building proteins. It is these proteins that define the structure, behavior, and intelligence of each cell, and of the organism. [2]

This computational machinery is at once remarkably complex and amazingly simple. Only four base pairs provide the data storage for the complexity of all the millions of life‐forms on Earth, from primitive bacteria to human beings. The ribosomes—little tape‐recorder molecules—read the code and build proteins from only twenty amino acids. The synchronized flexing of muscle cells, the intricate biochemical interactions in our blood, the structure and functioning of our brains, and all of the other diverse functions of the Earthʹs creatures are programmed in this efficient code.

The genetic information‐processing appliance is an existence proof of nanoengineering (building machines atom by atom), because the machinery of life indeed operates at the atomic level. Molecular components consisting of just dozens of atoms encode each bit and perform the transcription, error‐detection, and correction functions. The actual building of the organic material is conducted atom by atom with the assembly of the amino acid chains.

This is our understanding of the hardware of the computational engine driving life on Earth. We are just beginning, however, to unravel the software. While prolific, evolution has been a sloppy programmer. It has left us the object code (billions of bits of coded data), but there is no higher‐level source code (statements in a language we can understand), no explanatory comments, no ʺhelpʺ file, no documentation, and no user manual. Through the Human Genome Project, we are in the process of writing down the 6‐billion‐bit human genetic code, and are capturing the code for thousands of other species as well. [3] But reverse engineering the genome—understanding how it works—is a slow and laborious process that we are just beginning. As we do this, however, we are learning the information‐processing basis of disease, maturation, and aging, and are gaining the means to correct and refine evolutionʹs unfinished invention.

In addition to its lack of documentation, evolution is also a very inefficient programmer. Most of the code—97 percent according to current estimates—does not compute; that is, most of the sequences do not produce proteins and appear to be useless. That means the active part of the code amounts to only about 23 megabytes, which is less than the code for Microsoft Word. The code is also replete with redundancies. For example, an apparently meaningless sequence called Alu, comprising 300 nucleotide letters, occurs 300,000 times in the human genome, representing more than 3 percent of our genetic program.
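The arithmetic behind these figures is simple enough to check directly. The sketch below uses only the round numbers quoted in the text:

```python
# Back-of-envelope check of the chapter's figures (all numbers from the text).
genome_bits = 6e9                 # the ~6-billion-bit human genetic code
active_fraction = 0.03            # ~97 percent appears not to produce proteins
active_megabytes = genome_bits * active_fraction / 8 / 1e6
print(f"active code: ~{active_megabytes:.1f} MB")  # ~22.5 MB, i.e. about 23

alu_letters = 300 * 300_000       # Alu: 300 letters, occurring ~300,000 times
genome_letters = 3e9              # 3 billion rungs (2 bits each = 6e9 bits)
print(f"Alu share: ~{alu_letters / genome_letters:.0%}")  # ~3 percent
```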

The theory of evolution states that programming changes are introduced essentially at random. The changes are evaluated for retention by the survival of the entire organism and its ability to reproduce. Yet the genetic program controls not just the one characteristic being ʺexperimentedʺ with, but millions of other features as well. Survival of the fittest appears to be a crude technique, capable of concentrating on one or at most a few characteristics at a time. Since the vast majority of changes make things worse, it may seem surprising that this technique works at all.

This contrasts with the conventional human approach to computer programming, in which changes are designed with a purpose in mind, multiple changes may be introduced at a time, and the changes are tested by focusing on each change individually rather than by the overall survival of the program. If we attempted to improve our computer programs the way that evolution apparently improves its designs, our programs would collapse from increasing randomness.

It is remarkable that a process concentrating on only one refinement at a time could have designed structures as elaborate as the human eye. Some observers have postulated that such intricate design is impossible through the incremental‐refinement method that evolution uses. A design as intricate as the eye or the heart would appear to require a methodology in which the whole mechanism is designed at once.

However, the fact that a design such as the eye has many interacting aspects does not rule out its creation through a design path comprising one small refinement at a time. In utero, the human fetus appears to go through a process of evolution, although whether this is a corollary of the phases of evolution that led to our subspecies is not universally accepted. Nonetheless, most medical students learn that ontogeny (fetal development) recapitulates phylogeny (the evolution of a genetically related group of organisms, such as a phylum). We appear to start out in the womb with similarities to a fish embryo, progress to an amphibian, then a mammal, and so on. Regardless of the phylogeny controversy, we can see in the history of evolution the intermediate design drafts that evolution went through in designing apparently ʺcompleteʺ mechanisms such as the human eye. Even though evolution focuses on just one issue at a time, it is indeed capable of creating striking designs with many interacting parts.

There is a disadvantage, however, to evolutionʹs incremental method of design: it canʹt easily perform complete redesigns. It is stuck, for example, with the very slow computing speed of the mammalian neuron. But there is a way around this, as we will explore in chapter 6, ʺBuilding New Brains.ʺ

The Evolution of Evolution

There are also certain ways in which evolution has evolved its own means of evolving. The DNA‐based coding itself is clearly one such means. Within the code, other means have developed. Certain design elements, such as the shape of the eye, are coded in a way that makes mutations less likely: the error‐detection and correction mechanisms built into the DNA‐based coding make changes in these regions very unlikely. This enforcement of design integrity for certain critical features evolved because it provides an advantage—changes to these characteristics are usually catastrophic. Other design elements, such as the number and layout of light‐sensitive rods and cones in the retina, have fewer design enforcements built into the code. If we examine the evolutionary record, we do see more recent change in the layout of the retina than in the shape of the eyeball itself. So in certain ways the strategies of evolution have evolved. The Law of Accelerating Returns says that they should, for evolving its own strategies is the primary way that an evolutionary process builds on itself.
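One way to picture strategies of evolution evolving is to give each region of the code its own mutation rate, with critical features guarded by near-zero rates. The toy sketch below illustrates just that idea; the gene names, rates, and values are invented for the example and carry no biological weight:

```python
import random

# Toy illustration: per-region mutation rates. Critical design elements
# (like the shape of the eye) are protected by near-zero rates, while
# less critical ones (like retinal layout) remain freer to vary.
GENOME = {
    "eye_shape":     {"value": 1.0, "rate": 1e-6},
    "retina_layout": {"value": 1.0, "rate": 1e-2},
}

def mutate(genome):
    child = {}
    for gene, spec in genome.items():
        value = spec["value"]
        if random.random() < spec["rate"]:
            value += random.gauss(0.0, 0.1)  # small random change
        child[gene] = {"value": value, "rate": spec["rate"]}
    return child

genome = GENOME
for _ in range(100_000):  # over many generations...
    genome = mutate(genome)
# ...retina_layout drifts, while eye_shape barely moves.
```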

By simulating evolution, we can also confirm the ability of evolutionʹs ʺone step at a timeʺ design process to build ingenious designs of many interacting elements. One example is a software simulation of the evolution of life‐forms called Network Tierra, designed by Thomas Ray, a biologist and rain forest expert. [4] Rayʹs ʺcreaturesʺ are software simulations of organisms in which each ʺcellʺ has its own DNA‐like genetic code. The organisms compete with each other for the limited simulated space and energy resources of their simulated environment.

A unique aspect of this artificial world is that the creatures have free rein of 150 computers on the Internet, like ʺislands in an archipelago,ʺ according to Ray. One of the goals of this research is to understand how the explosion of diverse body plans that occurred on Earth during the Cambrian period, some 570 million years ago, was possible. ʺTo watch evolution unfold is a thrill,ʺ Ray exclaimed as he watched his creatures evolve from unspecialized single‐celled organisms to multicellular organisms with at least modest increases in diversity. Ray has reportedly identified the equivalent of parasites, immunities, and crude social interaction. One acknowledged limitation of Rayʹs simulation is the lack of complexity in his simulated environment. One insight of this research is that a suitably chaotic environment is a key resource needed to push evolution along, a resource in ample supply in the real world.

A practical application of evolution is the area of evolutionary algorithms, in which millions of evolving computer programs compete with one another in a simulated evolutionary process, thereby harnessing the inherent intelligence of evolution to solve real‐world problems. Since the intelligence of evolution is weak, we focus and amplify it the same way a lens concentrates the sparse rays of the sun. Weʹll talk more about this powerful approach to software design in chapter 4, ʺA New Form of Intelligence on Earth.ʺ
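In outline, such an algorithm is strikingly simple: random variation plus survival selection, repeated over many generations. The following is a minimal sketch; the fitness function, population size, and mutation scale are placeholder choices, not a description of any particular system:

```python
import random

def evolve(fitness, genome_length=20, population_size=100, generations=200):
    """Minimal evolutionary algorithm: random variation plus survival selection."""
    population = [[random.random() for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # "Survival of the fittest": keep the better half of the population.
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        # Refill with mutated copies of survivors (random programming changes).
        children = [[gene + random.gauss(0.0, 0.05) for gene in parent]
                    for parent in survivors]
        population = survivors + children
    return max(population, key=fitness)

# Example: evolve a genome whose values approach 0.5.
best = evolve(lambda g: -sum((x - 0.5) ** 2 for x in g))
```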

The Intelligence Quotient of Evolution

Let us first praise evolution. It has created a plethora of designs of indescribable beauty, complexity, and elegance, not to mention effectiveness. Indeed, some theories of aesthetics define beauty as the degree of success in emulating the natural beauty that evolution has created. It created human beings with their intelligent human brains, beings smart enough to create their own intelligent technology.

Its intelligence seems vast. Or is it? It has one deficiency—evolution is very slow. While it is true that it has created some remarkable designs, it has taken an extremely long period of time to do so. It took eons for the process to get started and, for the evolution of life‐forms, eons meant billions of years. Our human forebears also took eons to get started in their creation of technology, but for us eons meant only tens of thousands of years, a distinct improvement.

Is the length of time required to solve a problem or create an intelligent design relevant to an evaluation of intelligence?

The authors of our human intelligence‐quotient tests seem to think so, which is why most IQ tests are timed. We regard solving a problem in a few seconds as better than solving it in a few hours or years. Periodically, the timed aspect of IQ tests gives rise to controversy, but it shouldnʹt: the speed of an intelligent process is a valid aspect of its evaluation. If a large, hunched, catlike animal perched on a tree limb suddenly appears out of the corner of my left eye, designing an evasive tactic in a second or two is preferable to pondering the challenge for a few hours. If your boss asks you to design a marketing program, she probably doesnʹt want to wait a hundred years. Viking Penguin wanted this book delivered before the end of the second, not the third, millennium. [5]

Evolution has achieved an extraordinary record of design, yet has taken an extraordinarily long period of time to do so. If we factor its achievements by its ponderous pace, I believe we need to conclude that its intelligence quotient is only infinitesimally greater than zero. An IQ only slightly greater than zero (defining truly arbitrary behavior as zero) is enough for evolution to beat entropy and create wonderful designs, given enough time, in the same way that an ever so slight asymmetry in the balance between matter and antimatter was enough to allow matter to almost completely overtake its antithesis.

Evolution is thereby only a quantum smarter than completely unintelligent behavior. The reason that our human‐created evolutionary algorithms are effective is that we speed up time a million‐ or billionfold, so as to concentrate and focus evolutionʹs otherwise diffuse power. In contrast, humans are a lot smarter than just a quantum greater than total stupidity (of course, your view may vary depending on the latest news reports).

THE END OF THE UNIVERSE

What does the Law of Time and Chaos say about the end of the Universe? One theory is that the Universe will continue its expansion forever. Alternatively, if there's enough stuff, then the force of the Universe's own gravity will stop the expansion, resulting in a final "big crunch." Unless, of course, there's an antigravity force. Or the "cosmological constant," Einstein's "fudge factor," turns out to be big enough. I've had to rewrite this paragraph three times over the past several months because the physicists can't make up their minds. The latest speculation apparently favors indefinite expansion.

Personally, I prefer the idea of the Universe closing in again on itself as more aesthetically pleasing. That would mean that the Universe would reverse its expansion and reach a singularity again. We can speculate that it would again expand and contract in an endless cycle. Most things in the Universe seem to move in cycles, so why not the Universe itself? The Universe could then be regarded as a tiny wave particle in some other really big Universe. And that big Universe would itself be a vibrating particle in yet another even bigger Universe. Conversely, the tiny wave particles in our Universe can each be regarded as little Universes, with each of their vibrations, lasting fractions of a trillionth of a second in our Universe, representing billions of years of expansion and contraction in that little Universe. And each particle in those little Universes could be . . . okay, so I'm getting a little carried away.

How to Unsmash a Cup

Let's say the Universe reverses its expansion. The phase of contraction has characteristics opposite to those of the phase of expansion that we are now in. Clearly, chaos in the Universe will be decreasing as the Universe gets smaller. I can see that this is the case by considering the endpoint, which is again a singularity with no size, and therefore no disorder.

We regard time as moving in one direction because processes in time are not generally reversible. If we smash a cup, we find it difficult to unsmash it. The reason for this has to do with the second law of thermodynamics: since overall entropy may increase but can never decrease, time has directionality. Smashing a cup increases randomness; unsmashing the cup would violate the second law of thermodynamics.
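In symbols, the second law can be written as follows (a standard textbook formulation, not a formula from the text):

```latex
% Second law of thermodynamics: the total entropy S of an isolated
% system, here the Universe as a whole, may increase but never decrease.
\frac{dS_{\mathrm{Universe}}}{dt} \geq 0
```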

Yet in the contracting phase of the Universe, chaos is decreasing, so we should regard time's direction as reversed.

This reverses all processes in time, turning evolution into devolution. Time moves backward during the second half of the Universe's time span. So if you smash a favorite cup, try to do it as we approach the midpoint of the Universe's time span. You should find the cup coming together again as we cross over into the Universe's contracting phase.

Now if time is moving backward during this contracting phase, what we (living in the expanding phase of the Universe) look forward to as the big crunch is actually a big bang to the creatures living (in reverse time) during the contracting phase. Consider the perspective of these time-reversed creatures living in what we regard as the contracting phase of the Universe. From their perspective, what we regard as the second phase is actually their first phase, with time going in the reverse direction. So from their perspective, the Universe during this phase is expanding, not contracting. Thus, if the "Universe will eventually contract" theory is correct, it would be proper to say that the Universe is bounded in time by two big bangs, with events flowing in opposite directions in time from each big bang, meeting in the middle. Creatures living in both phases can say that they are in the first half of the Universe's history, since both phases will appear to be the first half to creatures living in those phases. And in both halves of the time span of the Universe, the Law of Entropy, the Law of Time and Chaos, and the Law of Accelerating Returns (as applied to evolution) all hold true, but with time moving in opposite directions. [6]

The End of Time

And what if the Universe expands indefinitely? This would mean that the stars and galaxies will eventually exhaust their energy, leaving a Universe of dead stars expanding forever. That would leave a big mess—lots of randomness—and no meaningful order, so according to the Law of Time and Chaos, time would gradually come to a halt. Consistently, if a dead Universe means that there will be no conscious beings to appreciate it, then both the Quantum Mechanical and the Eastern subjective viewpoints appear to imply that the Universe would cease to exist.

In my view, neither conclusion is quite right. At the end of this book, I'll share with you my perspective on what happens at the end of the Universe. But don't look ahead.

Consider the sophistication of our creations over a period of only a few thousand years. Ultimately, our machines will match and exceed human intelligence, no matter how one cares to define or measure this elusive term. Even if my time frames are off, few serious observers who have studied the issue claim that computers will never achieve and surpass human intelligence. Humans will thereby have vastly beaten evolution, achieving in a matter of only thousands of years as much as or more than evolution achieved in billions of years. So human intelligence, a product of evolution, is far more intelligent than its creator.

And so, too, will the intelligence that we are creating come to exceed the intelligence of its creator. That is not the case today. But as the rest of this book will argue, it will take place very soon—in evolutionary terms, or even in terms of human history—and within the lifetimes of most of the readers of this book. The Law of Accelerating Returns predicts it. And furthermore, it predicts that the progression in the capabilities of human‐created machines will only continue to accelerate. The human species creating intelligent technology is another example of evolutionʹs progress building on itself. Evolution created human intelligence. Now human intelligence is designing intelligent machines at a far faster pace. Yet another example will come when our intelligent technology takes control of the creation of technology yet more intelligent than itself.

‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

NOW ON THIS TIME THING, WE START OUT AS A SINGLE CELL, RIGHT?

Thatʹs right.

AND THEN WE DEVELOP INTO SOMETHING RESEMBLING A FISH, THEN AN AMPHIBIAN, ULTIMATELY A MAMMAL, AND SO ON—YOU KNOW, ONTOGENY RECAPITULATES—

Phylogeny, yes.

SO THATʹS JUST LIKE EVOLUTION, RIGHT? WE GO THROUGH EVOLUTION IN OUR MOTHERʹS WOMB.

Yes, thatʹs the theory. The word phylogeny is derived from phylum . . .

BUT YOU SAID THAT IN EVOLUTION, TIME SPEEDS UP. YET IN AN ORGANISMʹS LIFE, TIME SLOWS DOWN.

Ah yes, a good catch. I can explain.

IʹM ALL EARS.

The Law of Time and Chaos states that, in a process, the average time interval between salient events is proportional to the amount of chaos in the process. So we have to be careful to define precisely what constitutes the process. It is true that evolution started out with single cells. And we also start out as a single cell. Sounds similar, but from the perspective of the Law of Time and Chaos, itʹs not. We start out as just one cell. When evolution was at the point of single cells, it was not one cell but many trillions of cells. And these cells were just swirling about; thatʹs a lot of chaos and not much order. The primary movement of evolution has been toward greater order. In the development of an organism, however, the primary movement is toward greater chaos—the grown organism has far greater disorder than the single cell it started out as. It draws that chaos from the environment as its cells multiply and as it has encounters with its environment. Is that clear?

UH, SURE. BUT DONʹT QUIZ ME ON IT. I THINK THE GREATEST CHAOS IN MY LIFE WAS WHEN I LEFT HOME TO GO TO COLLEGE. THINGS ARE JUST BEGINNING TO SETTLE DOWN NOW AGAIN.

I never said the Law of Time and Chaos explains everything.

OKAY, BUT EXPLAIN THIS. YOU SAID THAT EVOLUTION WASNʹT VERY SMART, OR AT LEAST WAS RATHER SLOW‐WITTED. BUT ARENʹT SOME OF THESE VIRUSES AND BACTERIA USING EVOLUTION TO OUTSMART US?

Evolution operates on different timescales. If we speed it up, it can be smarter than us. Thatʹs the idea behind software programs that apply a simulated evolutionary process to solving complex problems. Pathogen evolution is another example of the ability of evolution to amplify and focus its diffuse powers. After all, a viral generation can take place in minutes or hours, compared to decades for the human race. However, I do think we will ultimately prevail against the evolutionary tactics of our disease agents.

IT WOULD BE HELPFUL IF WE STOPPED OVERUSING ANTIBIOTICS.

Yes, and that brings up another issue, which is whether the human species is more intelligent than its individual members.

AS A SPECIES, WEʹRE CERTAINLY PRETTY SELF‐DESTRUCTIVE.

Thatʹs often true. Nonetheless, we do have a profound species‐wide dialogue going on. In other species, the individuals may communicate in a small clan or colony, but there is little, if any, sharing of information beyond that, and little apparent accumulated knowledge. The human knowledge base of science, technology, art, culture, and history has no parallel in any other species.

WHAT ABOUT WHALE SONGS?

Hmmm. I guess we just donʹt know what theyʹre singing about.

AND WHAT ABOUT THOSE APES THAT YOU CAN TALK TO ON THE INTERNET?

Well, on April 27, 1998, Koko the gorilla did engage in what her mentor, Francine Patterson, called the first interspecies chat, on America Online. [8] But Kokoʹs critics intimate that Patterson is the brains behind Koko.

BUT PEOPLE WERE ABLE TO CHAT WITH KOKO ONLINE.

Yes. However, Koko is rusty on her typing skills, so questions were interpreted by Patterson into American Sign Language, which Koko observed, and then Kokoʹs signed responses were interpreted by Patterson back into typed responses. I guess the suspicion is that Patterson is like those language interpreters from the diplomatic corps—one wonders if youʹre communicating with the dignitary, in this case Koko, or the interpreter.

ISNʹT IT CLEAR IN GENERAL THAT THE APES ARE COMMUNICATING? THEYʹRE NOT THAT DIFFERENT FROM US GENETICALLY, AS YOU SAID.

Thereʹs clearly some form of communication going on. The question being addressed by the linguistics community is whether the apes can really deal with the levels of symbolism embodied in human language. I think that Dr. Sue Savage‐Rumbaugh of Georgia State University, who runs a fifty‐five‐acre ape‐communication laboratory, made a fair statement recently when she said, ʺThey [her critics] are asking Kanzi [one of her ape subjects] to do everything that humans do, which is specious. Heʹll never do that. It still doesnʹt negate what he can do.ʺ

WELL, IʹM ROOTING FOR THE APES.

Yes, it would be nice to have someone to talk to when we get tired of other humans.

SO WHY DONʹT YOU JUST HAVE A LITTLE TALK WITH YOUR COMPUTER?

I do talk to my computer, and it dutifully takes down what I say to it. And I can give commands by speaking in natural language to Microsoft Word, [9] but itʹs still not a very engaging conversationalist. Remember, computers are still a million times simpler than the human brain, so itʹs going to be a couple of decades yet before they become comforting companions.

BACK ON THIS INDIVIDUAL‐VERSUS‐GROUP‐INTELLIGENCE ISSUE, ARENʹT MOST ACHIEVEMENTS IN ART AND SCIENCE ACCOMPLISHED BY INDIVIDUALS? YOU KNOW, YOU CANʹT WRITE A SONG OR PAINT A PICTURE BY COMMITTEE.

Actually, a lot of important science and technology is done in large groups.

BUT ARENʹT THE REAL BREAKTHROUGHS DONE BY INDIVIDUALS?

In many cases, thatʹs true. Even then, the critics and the technology conservatives, even the intolerant ones, do play an important screening role. Not every new and different idea is worth pursuing. Itʹs worthwhile having some barriers to break through.

Overall, the human enterprise is clearly capable of achievements that go far beyond what we can do as individuals.

HOW ABOUT THE INTELLIGENCE OF A LYNCH MOB?

I suppose a group is not always more intelligent than its members.

WELL, I HOPE THOSE TWENTY‐FIRST‐CENTURY MACHINES DONʹT EXHIBIT OUR MOB PSYCHOLOGY.

Good point.

I MEAN, I WOULDNʹT WANT TO END UP IN A DARK ALLEY WITH A BAND OF UNRULY MACHINES.

We should keep that in mind as we design our future machines. Iʹll make a little note . . .

YES, PARTICULARLY BEFORE THE MACHINES START, AS YOU SAID, DESIGNING THEMSELVES.

‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

C H A P T E R T H R E E

OF MIND AND MACHINES

PHILOSOPHICAL MIND EXPERIMENTS

ʺI am lonely and bored; please keep me company.ʺ

If your computer displayed this message on its screen, would that convince you that your notebook is conscious and has feelings?

Well, clearly no; itʹs rather trivial for a program to display such a message. The message actually comes from the presumably human author of the program that includes the message. The computer is just a conduit for the message, like a book or a fortune cookie.

Suppose we add speech synthesis to the program and have the computer speak its plaintive message. Have we changed anything?

While we have added technical complexity to the program, and some humanlike means of communication, we still do not regard the computer as the genuine author of the message.

Suppose now that the message is not explicitly programmed, but is produced by a game‐playing program that contains a complex model of its own situation. The specific message may never have been foreseen by the human creators of the program. It is created by the computer from the state of its own internal model as it interacts with you, the user. Are we getting closer to considering the computer as a conscious, feeling entity?

Maybe just a tad. But if we consider contemporary game software, the illusion is probably short‐lived as we gradually figure out the methods and limitations behind the computerʹs ability for small talk.

Now suppose the mechanisms behind the message grow to become a massive neural net, built from silicon but based on a reverse engineering of the human brain. Suppose we develop a learning protocol for this neural net that enables it to learn human language and model human knowledge. Its circuits are a million times faster than human neurons, so it has plenty of time to read all human literature and develop its own conceptions of reality. Its creators do not tell it how to respond to the world. Suppose now that it says, ʺIʹm lonely . . .ʺ

At what point do we consider the computer to be a conscious agent with its own free will? These have been the most vexing problems in philosophy since the Platonic dialogues illuminated the inherent contradictions in our conception of these terms.

Letʹs consider the slippery slope from the opposite direction. Our friend Jack (circa some time in the twenty‐first century) has been complaining of difficulty with his hearing. A diagnostic test indicates he needs more than a conventional hearing aid, so he gets a cochlear implant. Once used only by people with severe hearing impairments, these implants are now commonly used to restore hearing across the entire sonic spectrum. This routine surgical procedure is successful, and Jack is pleased with his improved hearing.

Is he still the same person?

Well, sure he is. People have cochlear implants circa 1999. We still regard them as the same person.

Now (back to circa sometime in the twenty‐first century), Jack is so impressed with the success of his cochlear implants that he elects to switch on the built‐in phonic‐cognition circuits, which improve overall auditory perception. These circuits are already built in so that he does not require another insertion procedure should he subsequently decide to enable them. By activating these neural‐replacement circuits, the phonics‐detection nets built into the implant bypass his own aging neural‐phonics regions. His cash account is also debited for the use of this additional neural software. Again, Jack is pleased with his improved ability to understand what people are saying.

Do we still have the same Jack? Of course; no one gives it a second thought.

Jack is now sold on the benefits of the emerging neural‐implant technology. His retinas are still working well, so he keeps them intact (although he does have permanently implanted retinal‐imaging displays in his corneas to view virtual reality), but he decides to try out the newly introduced image‐processing implants, and is amazed at how much more vivid and rapid his visual perception has become.

Still Jack? Why, sure.

Jack notices that his memory is not what it was, as he struggles to recall names, the details of earlier events, and so on. So heʹs back for memory implants. These are amazing—memories that had grown fuzzy with time are now as clear as if they had just happened. He also struggles with some unintended consequences, as he encounters unpleasant memories that he would have preferred to remain dim.

Still the same Jack? Clearly he has changed in some ways and his friends are impressed with his improved faculties. But he has the same self‐deprecating humor, the same silly grin—yes, itʹs still the same guy.

So why stop here? Ultimately Jack will have the option of scanning his entire brain and neural system (which is not entirely located in the skull) and replacing it with electronic circuits of far greater capacity, speed, and reliability. Thereʹs also the benefit of keeping a backup copy in case anything happens to the physical Jack.

Certainly this specter is unnerving, perhaps more frightening than appealing. And undoubtedly it will be controversial for a long time (although according to the Law of Accelerating Returns, a ʺlong timeʺ is not as long as it used to be). Ultimately, the overwhelming benefits of replacing unreliable neural circuits with improved ones will be too compelling to ignore.

Have we lost Jack somewhere along the line? Jackʹs friends think not. Jack also claims that heʹs the same old guy, just newer. His hearing, vision, memory, and reasoning ability have all improved, but itʹs still the same Jack.

However, letʹs examine the process a little more carefully. Suppose that rather than implementing this change a step at a time, as in the above scenario, Jack does it all at once. He goes in for a complete brain scan and has the information from the scan instantiated (installed) in an electronic neural computer. Not one to do things piecemeal, he upgrades his body as well. Does making the transition all at one time change anything? Well, whatʹs the difference between changing from neural circuits to electronic/photonic ones all at once, as opposed to doing it gradually? Even if he makes the change in one quick step, the new Jack is still the same old Jack, right?

But what about Jackʹs old brain and body? Assuming a noninvasive scan, these still exist. This is Jack! Whether the scanned information is subsequently used to instantiate a copy of Jack does not change the fact that the original Jack still exists and is relatively unchanged. Jack may not even be aware of whether or not a new Jack is ever created. And for that matter, we can create more than one new Jack.

If the procedure involves destroying the old Jack once we have conducted some quality‐assurance steps to make sure the new Jack is fully functional, does that not constitute the murder (or suicide) of Jack?

Suppose the original scan of Jack is not noninvasive, that it is a ʺdestructiveʺ scan. Note that, technologically speaking, a destructive scan is much easier; in fact, we have the technology today (1999) to destructively scan frozen neural sections, ascertain the interneuronal wiring, and reverse engineer the neuronsʹ parallel digital‐analog algorithms. [1] We donʹt yet have the bandwidth to do this quickly enough to scan anything but a very small portion of the brain. But the same speed issue existed for another scanning project—the human genome scan—when that project began. At the speed at which researchers were able to scan and sequence the human genetic code in 1991, it would have taken thousands of years to complete the project. Yet a fourteen‐year schedule was set, which it now appears will be successfully realized. The Human Genome Project deadline made the (correct) assumption that the speed of our methods for sequencing DNA codes would greatly accelerate over time. The same phenomenon will hold true for our human‐brain‐scanning projects. We can do it now—very slowly—but that speed, like most everything else governed by the Law of Accelerating Returns, will get exponentially faster in the years ahead.
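The arithmetic of an exponentially accelerating project is worth making explicit. Under the illustrative assumption that scanning speed doubles each year, a workload that looks like ten thousand years at the starting rate is finished in about fourteen:

```python
# Illustrative only: why "thousands of years at the 1991 speed" is compatible
# with a fourteen-year schedule, if capability doubles annually.
workload = 10_000        # assumed years of work at the starting speed
done, speed, years = 0.0, 1.0, 0
while done < workload:
    done += speed        # one year of scanning at the current speed
    speed *= 2.0         # exponential improvement in scanning speed
    years += 1
print(years)             # 14, since 1 + 2 + 4 + ... + 2**13 = 16383 >= 10000
```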

Now suppose that as we destructively scan Jack, we simultaneously install this information into the new Jack. We can consider this a process of ʺtransferringʺ Jack to his new brain and body. So one might say that Jack is not destroyed, just transferred into a more suitable embodiment. But is this not equivalent to scanning Jack noninvasively, subsequently instantiating the new Jack, and then destroying the old Jack? If that sequence of steps basically amounts to killing the old Jack, then this process of transferring Jack in a single step must amount to the same thing. Thus we can argue that any process of transferring Jack amounts to the old Jack committing suicide, and that the new Jack is not the same person.

The concept of scanning and reinstantiating this information is familiar to us from the fictional ʺbeam me upʺ teleportation technology of Star Trek. In that fictional show, the scan and reconstitution is presumably on a nanoengineering scale—that is, particle by particle—rather than just reconstituting the salient algorithms of neural information processing, as envisioned above. But the concept is very similar. Therefore, it can be argued that the Star Trek characters are committing suicide each time they teleport, with new characters being created. These new characters, while essentially identical, are made up of entirely different particles, unless we imagine that it is the actual particles being beamed to the new destination. Probably it would be easier to beam just the information and use local particles to instantiate the new embodiments. Should it matter? Is consciousness a function of the actual particles, or just of their pattern and organization?

We argue that consciousness and identity are not a function of the specific particles at all, because our own particles are constantly changing. On a cellular basis, we change most of our cells (although not our brain cells) over a period of several years. [2] On an atomic level, the change is much faster than that, and it does include our brain cells. We are not at all permanent collections of particles. It is the patterns of matter and energy that are semipermanent (that is, changing only gradually); our actual material content is changing constantly, and very quickly. We are rather like the patterns that water makes in a stream. The rushing water around a formation of rocks makes a particular, unique pattern. This pattern may remain relatively unchanged for hours, even years. Of course, the actual material constituting the pattern—the water—is totally replaced within milliseconds. This argues that we should not associate our fundamental identity with a specific set of particles, but rather with the pattern of matter and energy that we represent. This, then, would argue that we should consider the new Jack to be the same as the old Jack, because the pattern is the same. (One might quibble that while the new Jack has similar functionality to the old Jack, he is not identical. However, this just dodges the essential question, because we can reframe the scenario with a nanoengineering technology that copies Jack atom by atom rather than just copying his salient information‐processing algorithms.)

Contemporary philosophers seem to be partial to the ʺidentity from patternʺ argument. And given that our pattern changes only slowly in comparison to our particles, there is some apparent merit to this view. But the counter to that argument is the ʺold Jackʺ waiting to be extinguished after his ʺpatternʺ has been scanned and installed in a new computing medium. Old Jack may suddenly realize that the ʺidentity from patternʺ argument is flawed.

MIND AS MACHINE VERSUS MIND BEYOND MACHINE

Science cannot solve the ultimate mystery of nature because in the last analysis we are part of the mystery we are trying to solve.

—Max Planck

Is all that we see or seem but a dream within a dream?

—Edgar Allan Poe

What if everything is an illusion and nothing exists? In that case, I definitely overpaid for my carpet.

—Woody Allen

The Difference Between Objective and Subjective Experience

Can we explain the experience of diving into a lake to someone who has never been immersed in water? How about the rapture of sex to someone who has never had erotic feelings (assuming one could find such a person)? Can we explain the emotions evoked by music to someone congenitally deaf? A deaf person will certainly learn a lot about music: watching people sway to its rhythm, reading about its history and role in the world. But none of this is the same as experiencing a Chopin prelude.

If I view light with a wavelength of 0.000075 centimeters, I see red. Change the wavelength to 0.000035 centimeters and I see violet. The same colors can also be produced by mixing colored lights. If red and green lights are properly combined, I see yellow. Mixing pigments works differently from changing wavelengths, however, because pigments subtract colors rather than add them. Human perception of color is more complicated than mere detection of electromagnetic frequencies, and we still do not fully understand it. Yet even if we had a fully satisfactory theory of our mental process, it would not convey the subjective experience of redness or yellowness. I find language inadequate for expressing my experience of redness. Perhaps I can muster some poetic reflections about it, but unless youʹve had the same encounter, it is really not possible for me to share my experience.
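The additive mixing mentioned above can at least be stated precisely, even if the subjective experience cannot. The toy snippet below treats lights as RGB triples, a crude stand-in for color perception (which, as noted, is far more complicated), just to make the red-plus-green example concrete:

```python
def mix_lights(*lights):
    """Additive mixing: combining lights sums their components (clipped at 255)."""
    return tuple(min(255, sum(channel)) for channel in zip(*lights))

RED, GREEN = (255, 0, 0), (0, 255, 0)
print(mix_lights(RED, GREEN))  # (255, 255, 0): yellow, as described above
```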

So how do I know that you experience the same thing when you talk about redness? Perhaps you experience red the way I experience blue, and vice versa. How can we test our assumption that we experience these qualities the same way? Indeed, we do know there are some differences. Since I have what is misleadingly labeled ʺred‐green color‐blindness,ʺ there are shades of color that appear identical to me but different to others. Those of you without this disability apparently have a different experience than I do. What are you all experiencing? Iʹll never know.

Giant squids are wondrous, sociable creatures with eyes similar in structure to our own (which is surprising, given their very different phylogeny) and a complex nervous system. A few fortunate human scientists have developed relationships with these clever cephalopods. So what is it like to be a giant squid? When we see one respond to danger and express behavior that reminds us of a human emotion, we infer an experience that we are familiar with. But what of their experiences without a human counterpart?

Or do they have experiences at all? Maybe they are just like ʺmachinesʺ—responding programmatically to stimuli in their environment. Maybe there is no one home. Some humans are of this view: only humans are conscious; animals just respond to the world by ʺinstinct,ʺ that is, like a machine. To many other humans, this author included, it seems apparent that at least the more evolved animals are conscious creatures, based on empathetic perceptions of animals expressing emotions that we recognize as correlates of human reactions. Yet even this is a human‐centric way of thinking, in that it recognizes only subjective experiences with a human equivalent. Opinion on animal consciousness is far from unanimous. Indeed, it is the question of consciousness that underlies the issue of animal rights. Disputes about whether or not certain animals are suffering in certain situations result from our general inability to experience or measure the subjective experience of another entity. [3]

The not uncommon view of animals as ʺjust machinesʺ is disparaging to both animals and machines. Machines today are still a million times simpler than the human brain; their complexity and subtlety is comparable to that of insects. There is relatively little speculation on the subjective experience of insects, although again, there is no convincing way to measure it. But the disparity in the capabilities of machines and the more advanced animals, such as the Homo sapiens sapiens subspecies, will be short‐lived. The unrelenting advance of machine intelligence, which we will visit in the next several chapters, will bring machines to human levels of intricacy and refinement, and beyond, within several decades. Will these machines be conscious?

And what about free will—will machines of human complexity make their own decisions, or will they just follow a program, albeit a very complex one? Is there a distinction to be made here?

The issue of consciousness lurks behind other vexing issues. Take the question of abortion. Is a fertilized egg cell a conscious human being? How about a fetus one day before birth?

Itʹs hard to say that a fertilized egg is conscious, or that a full‐term fetus is not. Pro‐choice and pro‐life activists are afraid of the slippery slope between these two definable ledges. And the slope is genuinely slippery—a human fetus develops a brain quickly, but it is not immediately recognizable as a human brain; the brain of a fetus becomes humanlike only gradually. The slope has no ridges to stand on. Admittedly, other hard‐to‐define questions, such as human dignity, come into the debate, but fundamentally the contention concerns sentience. In other words, when do we have a conscious entity?

Some severe forms of epilepsy have been successfully treated by surgical removal of the impaired half of the brain. This drastic surgery needs to be done during childhood, before the brain has fully matured. Either half of the brain can be removed, and if the operation is successful the child will grow up more or less normally. Does this imply that both halves of the brain have their own consciousness? Perhaps there are two of us in each intact brain, who, one hopes, get along with each other. Maybe there is a whole panoply of consciousnesses lurking in one brain, each with a somewhat different perspective. Is there a consciousness that is aware of the mental processes we consider unconscious?

I could go on for a long time with such conundrums. And indeed, people have been thinking about these quandaries for a long time. Plato, for one, was preoccupied with these issues. In the Phaedo, The Republic, and Theaetetus, Plato expresses the profound paradox inherent in the concept of consciousness and a humanʹs apparent ability to choose freely. On the one hand, human beings partake of the natural world and are subject to its laws. Our brains are natural phenomena and thus must follow the cause‐and‐effect laws manifest in machines and other lifeless creations of our species. Plato was familiar with the potential complexity of machines and their ability to emulate elaborate logical processes. On the other hand, cause‐and‐effect mechanics, no matter how complex, should not, according to Plato, give rise to self‐awareness or consciousness. Plato first attempts to resolve this conflict in his theory of the Forms: consciousness is not an attribute of the mechanics of thinking but rather the ultimate reality of human existence. Our consciousness, or ʺsoul,ʺ is immutable and unchangeable. Thus, our mental interaction with the physical world is on the level of the ʺmechanicsʺ of our complicated thinking process. The soul stands aloof.

But no, this doesnʹt really work, Plato realizes. If the soul is unchanging, then it cannot learn or partake in reason, because it would need to change to absorb and respond to experience. Plato ends up dissatisfied with positing consciousness in either place: the rational processes of the natural world or the mystical level of the ideal Form of the self or soul. [4]

The concept of free will reflects an even deeper paradox. Free will is purposeful behavior and decision making. Plato believed in a ʺcorpuscular physicsʺ based on fixed and determined rules of cause and effect. But if human decision making is based on such predictable interactions of basic particles, our decisions must also be predetermined. That would contradict human freedom to choose. The addition of randomness to the natural laws is a possibility, but it does not solve the problem. Randomness would eliminate the predetermination of decisions and actions, but it contradicts the purposefulness of free will, as there is nothing purposeful in randomness.

Okay, letʹs put free will in the soul. No, that doesnʹt work either. Separating free will from the rational cause‐and‐effect mechanics of the natural world would require putting reason and learning into the soul as well, for otherwise the soul would not have the means to make meaningful decisions. Now the soul is itself becoming a complex machine, which contradicts its mystical simplicity.

Perhaps this is why Plato wrote dialogues. That way he could passionately express both sides of these contradictory positions. I am sympathetic to Platoʹs dilemma: None of the obvious positions is really sufficient. A deeper truth can be perceived only by illuminating the opposing sides of a paradox.

Plato was certainly not the last thinker to ponder these questions. We can identify several schools of thought on these subjects, none of them very satisfactory.

The ʺConsciousness Is Just a Machine Reflecting on Itselfʺ School

A common approach is to deny that the issue exists: consciousness and free will are just illusions induced by the ambiguities of language. A slight variation is that consciousness is not exactly an illusion, but just another logical process. It is a process responding and reacting to itself. We can build that in a machine: just build a procedure that has a model of itself, one that examines and responds to its own methods. Allow the process to reflect on itself. There, now you have consciousness. It is a set of abilities that evolved because self‐reflective ways of thinking are inherently more powerful.

The difficulty with arguing against the ʺconsciousness is just a machine reflecting on itselfʺ school is that this perspective is self‐consistent. But this viewpoint ignores the subjective viewpoint. It can deal with a personʹs reporting of subjective experience, and it can relate reports of subjective experiences not only to outward behavior but to patterns of neural firings as well. And if I think about it, my knowledge of the subjective experience of anyone aside from myself is no different (to me) from the rest of my objective knowledge. I donʹt experience other peopleʹs subjective experiences; I just hear about them. So the only subjective experience this school of thought ignores is my own (that is, after all, what the term subjective experience means). And, hey, Iʹm only one person among billions of humans, trillions of potentially conscious organisms, all of whom, with just one exception, are not me.

But the failure to explain my subjective experience is a serious one. It does not explain the distinction between 0.000075‐centimeter electromagnetic radiation and my experience of redness. I could learn how color perception works, how the human brain processes light, how it processes combinations of light, even what patterns of neural firing this all provokes, but it still fails to explain the essence of my experience.

The Logical Positivists [5]

I am doing my best to express what I am talking about here, but unfortunately the issue is not entirely effable. D. J. Chalmers describes the mystery of the experienced inner life as the ʺhard problemʺ of consciousness, to distinguish this issue from the ʺeasy problemʺ of how the brain works. [6] Marvin Minsky observed that ʺthereʹs something queer about describing consciousness: Whatever people mean to say, they just canʹt seem to make it clear.ʺ That is precisely the problem, says the ʺconsciousness is just a machine reflecting on itselfʺ school—to speak of consciousness other than as a pattern of neural firings is to wander off into a mystical realm beyond any hope of verification.

This objective view is sometimes referred to as logical positivism, a philosophy codified by Ludwig Wittgenstein in his Tractatus Logico‐Philosophicus. [7] To the logical positivists, the only things worth talking about are our direct sensory experiences, and the logical inferences that we can make therefrom. Everything else ʺwe must pass over in silence,ʺ to quote Wittgensteinʹs last statement in his treatise.

Yet Wittgenstein did not practice what he preached. Published in 1953, two years after his death, his Philosophical Investigations defined the matters worth contemplating as precisely those issues he had earlier argued should be passed over in silence. [8] Apparently he came to the view that the antecedents of his last statement in the Tractatus—what we cannot speak about—are the only real phenomena worth reflecting upon. The late Wittgenstein heavily influenced the existentialists, representing perhaps the first time since Plato that a major philosopher succeeded in illuminating such contradictory views.

I Think, Therefore I Am

The early Wittgenstein and the logical positivists he inspired are often thought to have their roots in the philosophical investigations of René Descartes. [9] Descartesʹs famous dictum ʺI think, therefore I amʺ has often been cited as emblematic of Western rationalism. This view interprets Descartes to mean ʺI think, that is, I can manipulate logic and symbols, therefore I am worthwhile.ʺ But in my view, Descartes was not intending to extol the virtues of rational thought. He was troubled by what has become known as the mind‐body problem: the paradox of how mind can arise from nonmind, how thoughts and feelings can arise from the ordinary matter of the brain. Pushing rational skepticism to its limits, his statement really means ʺI think, that is, there is an undeniable mental phenomenon, some awareness, occurring, therefore all we know for sure is that something—letʹs call it I—exists.ʺ Viewed in this way, there is less of a gap than is commonly thought between Descartes and Buddhist notions of consciousness as the primary reality.

Before 2030, we will have machines proclaiming Descartesʹs dictum. And it wonʹt seem like a programmed response. The machines will be earnest and convincing. Should we believe them when they claim to be conscious entities with their own volition?

The ʺConsciousness Is a Different Kind of Stuffʺ School

The issue of consciousness and free will has been, of course, a major preoccupation of religious thought. Here we encounter a panoply of phenomena, ranging from the elegance of Buddhist notions of consciousness to ornate pantheons of souls, angels, and gods. In a similar category are theories by contemporary philosophers that regard consciousness as yet another fundamental phenomenon in the world, like basic particles and forces. I call this the ʺconsciousness is a different kind of stuffʺ school. To the extent that this school implies an interference by consciousness in the physical world that runs afoul of scientific experiment, science is bound to win because of its ability to verify its insights. To the extent that this view stays aloof from the material world, it often creates a level of complex mysticism that cannot be verified and is subject to disagreement. To the extent that it keeps its mysticism simple, it offers limited objective insight, although subjective insight is another matter (I do have to admit a fondness for simple mysticism).

The ʺWeʹre Too Stupidʺ School

Another approach is to declare that human beings just arenʹt capable of understanding the answer. Artificial intelligence researcher Douglas Hofstadter muses that ʺit could be simply an accident of fate that our brains are too weak to understand themselves. Think of the lowly giraffe, for instance, whose brain is obviously far below the level required for self‐understanding—yet it is remarkably similar to our brain.ʺ [10] But to my knowledge, giraffes are not known to ask these questions (of course, we donʹt know what they spend their time wondering about). In my view, if we are sophisticated enough to ask the questions, then we are advanced enough to understand the answers. However, the ʺweʹre too stupidʺ school points out that we are indeed having difficulty formulating these questions clearly.

A Synthesis of Views

My own view is that all of these schools are correct when viewed together, but insufficient when viewed one at a time. That is, the truth lies in a synthesis of these views. This reflects my Unitarian religious education in which we studied all the worldʹs religions, considering them ʺmany paths to the truth.ʺ Of course, my view may be regarded as the worst one of all. On its face, my view is contradictory and makes little sense. The other schools at least can claim some level of consistency and coherence.

Thinking Is as Thinking Does

Oh yes, there is one other view, which I call the ʺthinking is as thinking doesʺ school. In a 1950 paper, Alan Turing describes his concept of the Turing Test, in which a human judge interviews both a computer and one or more human

foils using terminals (so that the judge wonʹt be prejudiced against the computer for lacking a warm and fuzzy appearance). [11] If the human judge is unable to reliably unmask the computer (as an impostor human), then the computer wins. The test is often described as a kind of computer IQ test, a means of determining if computers have

achieved a human level of intelligence. In my view, however, Turing really intended his Turing Test as a test of thinking, a term he uses to imply more than just clever manipulation of logic and language. To Turing, thinking implies conscious intentionality.

Turing had an implicit understanding of the exponential growth of computing power, and predicted that a computer would pass his eponymous exam by the end of the century. He remarked that by that time ʺthe use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking

without expecting to be contradicted.ʺ His prediction was overly optimistic in terms of time frame, but in my view not by much.

In the end, Turingʹs prediction foreshadows how the issue of computer thought will be resolved. The machines will convince us that they are conscious, that they have their own agenda worthy of our respect. We will come to believe that they are conscious much as we believe that of each other. More so than with our animal friends, we will empathize with their professed feelings and struggles because their minds will be based on the design of human thinking. They will embody human qualities and will claim to be human. And weʹll believe them.

THE VIEW FROM QUANTUM MECHANICS

I often dream about falling. Such dreams are commonplace to the ambitious or those who climb mountains. Lately I dreamed I was clutching at the face of a rock, but it would not hold. Gravel gave way. I grasped for a shrub, but it pulled loose, and in cold terror I fell into the abyss. Suddenly I realized that my fall was relative; there was no bottom and no end. A feeling of pleasure overcame me. I realized that what I embody, the principle of life, cannot be destroyed. It is written into the cosmic code, the order of the universe. As I continued to fall in the dark void, embraced by the vault of the heavens, I sang to the beauty of the stars and made my peace with the darkness.

—Heinz Pagels, physicist and quantum mechanics researcher, before his death in a 1988 climbing accident

The Western objective view states that after billions of years of swirling around, matter and energy evolved to create life-forms—complex self-replicating patterns of matter and energy—that became sufficiently advanced

to reflect on their own existence, on the nature of matter and energy, on their own consciousness. In

contrast, the Eastern subjective view states that consciousness came first—matter and energy are merely the

complex thoughts of conscious beings, ideas that have no reality without a thinker.

As noted above, the objective and subjective views of reality have been at odds since the dawn of recorded

history. There is often merit, however, in combining seemingly irreconcilable views to achieve a deeper

understanding. Such was the case with the adoption of quantum mechanics early in the twentieth century. Rather than

reconcile the views that electromagnetic radiation (for example, light) was either a stream of particles (that

is, photons) or a vibration (that is, light waves), both views were fused into an irreducible duality. While this

idea is impossible to grasp using only our intuitive models of nature, we are unable to explain the world

without accepting this apparent contradiction. Other paradoxes of quantum mechanics (for example, electron

"tunneling" in which electrons in a transistor appear on both sides of a barrier) helped create the age of computation, and may unleash a new revolution in the form of the quantum computer, [12] but more about

that later.

Once we accept such a paradox, wonderful things happen. In postulating the duality of light, quantum

mechanics has discovered an essential nexus between matter and consciousness. Particles apparently do not

make up their minds as to which way they are going or even where they have been until they are forced to do

so by the observations of a conscious observer. We might say that, retroactively, they appear not really to exist at all until and unless we notice them.

So twentieth-century Western science has come around to the Eastern view. The Universe is sufficiently

sublime that the essentially Western objective view of consciousness arising from matter and the essentially

Eastern subjective view of matter arising from consciousness apparently coexist as another irreducible duality.

Clearly, consciousness, matter, and energy are inextricably linked.

We may note here a similarity of quantum mechanics to the computer simulation of a virtual world. In

today's software games that display images of a virtual world, the portions of the environment not currently

being interacted with by the user (that is, those off screen) are usually not computed in detail, if at all. The

limited resources of the computer are directed toward rendering the portion of the world that the user is

currently viewing. As the user focuses in on some other aspect, the computational resources are then

immediately directed toward creating and displaying that new perspective. It thus seems as if the portions of

the virtual world that are offscreen are nonetheless still ʺthere,ʺ but the software designers figure there is no point wasting valuable computer cycles on regions of their simulated world that no one is watching.

I would say that quantum theory implies a similar efficiency in the physical world. Particles appear not to

decide where they have been until forced to do so by being observed. The implication is that portions of the

world we live in are not actually "rendered" until some conscious observer turns her attention toward them.

After all, there's no point wasting valuable "computes" of the celestial computer that renders our Universe.

This gives new meaning to the question about the unheard tree that falls in the forest.

‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

ON THIS MULTIPLE‐CONSCIOUSNESS IDEA, WOULDNʹT I NOTICE THAT—I MEAN IF I HAD DECIDED TO DO

ONE THING AND THIS OTHER CONSCIOUSNESS IN MY HEAD WENT AHEAD AND DECIDED SOMETHING

ELSE?

I thought you had decided not to finish that muffin you just devoured.

TOUCHE. OKAY, IS THAT AN EXAMPLE OF WHAT YOUʹRE TALKING ABOUT?

It is a better example of Marvin Minskyʹs Society of Mind, in which he conceives of our mind as a society of other minds: some like muffins, some are vain, some are health conscious; some make resolutions, others break them. Each

of these in turn is made up of other societies. At the bottom of this hierarchy are little mechanisms Minsky calls agents with little or no intelligence. It is a compelling vision of the organization of intelligence, including such phenomena as mixed emotions and conflicting values.

SOUNDS LIKE A GREAT LEGAL DEFENSE. ʺNO, JUDGE, IT WASNʹT ME. IT WAS THIS OTHER GAL IN MY

HEAD WHO DID THE DEED!ʺ

Thatʹs not going to do you much good if the judge decides to lock up the other gal in your head.

THEN HOPEFULLY THE WHOLE SOCIETY IN MY HEAD WILL STAY OUT OF TROUBLE. BUT WHICH MINDS

IN MY SOCIETY OF MIND ARE CONSCIOUS?

We could imagine that each of these minds in the society of mind is conscious, albeit that the lowest‐ranking ones have relatively little to be conscious of. Or perhaps consciousness is reserved for the higher‐ranking minds. Or perhaps only certain combinations of higher‐ranking minds are conscious, whereas others are not. Or perhaps—

NOW WAIT A SECOND, HOW CAN WE TELL WHAT THE ANSWER IS?

I believe thereʹs really no way to tell. What possible experiment can we run that would conclusively prove whether an entity or process is conscious? If the entity says, ʺHey, Iʹm really conscious,ʺ does that settle the matter? If the entity is very compelling when it expresses a professed emotion, is that definitive? If we look carefully at its internal methods and see feedback loops in which the process examines and responds to itself, does that mean itʹs conscious? If we see certain types of patterns in its neural firings, is that convincing? Contemporary philosophers such as Daniel Dennett appear to believe that the consciousness of an entity is a testable and measurable attribute. But I think science is inherently about objective reality. I donʹt see how it can break through to the subjective level.

MAYBE IF THE THING PASSES THE TURING TEST?

That is what Turing had in mind. Lacking any conceivable way of building a consciousness detector, he settled on a

practical approach, one that emphasizes our unique human proclivity for language. And I do think that Turing is right in a way—if a machine can pass a valid Turing Test, I believe that we will believe that it is conscious. Of course, thatʹs still not a scientific demonstration.

The converse proposition, however, is not compelling. Whales and elephants have bigger brains than we do and

exhibit a wide range of behaviors that knowledgeable observers consider intelligent. I regard them as conscious creatures, but they are in no position to pass the Turing Test.

THEY WOULD HAVE TROUBLE TYPING ON THESE SMALL KEYS OF MY COMPUTER.

Indeed, they have no fingers. They are also not proficient in human languages. The Turing Test is clearly a human‐

centric measurement.

IS THERE A RELATIONSHIP BETWEEN THIS CONSCIOUSNESS STUFF AND THE ISSUE OF TIME THAT WE

SPOKE ABOUT EARLIER?

Yes, we clearly have an awareness of time. Our subjective experience of time passage—and remember that subjective

is just another word for conscious—is governed by the speed of our objective processes. If we change this speed by

altering our computational substrate, we affect our perception of time.

RUN THAT BY ME AGAIN.

Letʹs take an example. If I scan your brain and nervous system with a suitably advanced noninvasive‐scanning technology of the early twenty‐first century—a very‐high‐resolution, high‐bandwidth magnetic resonance imaging, perhaps—ascertain all the salient information processes and then download that information to my suitably advanced

neural computer, Iʹll have a little you or at least someone very much like you right here in my personal computer.

If my personal computer is a neural net of simulated neurons made of electronic stuff rather than human stuff, the

version of you in my computer will run about a million times faster. So an hour for me would be a million hours for

you, which is about a century.

OH, THATʹS GREAT, YOUʹLL DUMP ME IN YOUR PERSONAL COMPUTER, AND THEN FORGET ABOUT ME

FOR A SUBJECTIVE MILLENNIUM OR TWO.

Weʹll have to be careful about that, wonʹt we?
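
For readers who want to check the arithmetic above (one hour at a millionfold speedup coming to about a century), here it is as a few lines of Python. The millionfold figure is, of course, the speculative assumption from the dialogue, not an established number.

    # Subjective time for a mind emulation running a millionfold faster than
    # biology (the speedup factor is the speculative assumption from the text).
    SPEEDUP = 1_000_000

    host_hours = 1                                   # one hour of real time
    subjective_hours = host_hours * SPEEDUP          # hours the emulation experiences
    subjective_years = subjective_hours / (24 * 365)

    print(f"{subjective_hours:,} subjective hours, about {subjective_years:.0f} years")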

‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

C H A P T E R F O U R

A NEW FORM OF

INTELLIGENCE ON EARTH

THE ARTIFICIAL INTELLIGENCE MOVEMENT

What if these theories are really true, and we were magically shrunk and put into someoneʹs brain while he was thinking. We would see all the pumps, pistons, gears and levers working away, and we would be able to describe their workings completely, in mechanical terms, thereby completely describing the thought processes of the brain. But that description would nowhere contain any mention of thought! It would contain nothing but descriptions of pumps, pistons, levers!

—Gottfried Wilhelm Leibniz

Artificial stupidity (AS) may be defined as the attempt by computer scientists to create computer programs capable of causing problems of a type normally associated with human thought.

—Wallace Marshal

Artificial intelligence (AI) is the science of how to get machines to do the things they do in the movies.

—Astro Teller

The Ballad of Charles and Ada

Returning to the evolution of intelligent machines, we find Charles Babbage sitting in the rooms of the Analytical Society at Cambridge, England, in 1821, with a table of logarithms lying before him.

ʺWell, Babbage, what are you dreaming about?ʺ asked another member, seeing Babbage half asleep.

ʺI am thinking that all these tables might be calculated by machinery!ʺ Babbage replied.

From that moment on, Babbage devoted most of his waking hours to an unprecedented vision: the worldʹs first

programmable computer. Although based entirely on the mechanical technology of the nineteenth century, Babbageʹs

ʺAnalytical Engineʺ was a remarkable foreshadowing of the modern computer. [1]

Babbage developed a liaison with the beautiful Ada Lovelace, the only legitimate child of Lord Byron, the poet.

She became as obsessed with the project as Babbage, and contributed many of the ideas for programming the

machine, including the invention of the programming loop and the subroutine. She was the worldʹs first software

engineer, indeed the only software engineer prior to the twentieth century.

Lovelace significantly extended Babbageʹs ideas and wrote a paper on programming techniques, sample

programs, and the potential of this technology to emulate intelligent human activities. She describes the speculations of Babbage and herself on the capacity of the Analytical Engine, and future machines like it, to play chess and

compose music. She finally concludes that although the computations of the Analytical Engine could not properly be

regarded as ʺthinking,ʺ they could nonetheless perform activities that would otherwise require the extensive

application of human thought.

The story of Babbage and Lovelace ends tragically. She died a painful death from cancer at the age of thirty‐six,

leaving Babbage alone again to pursue his quest. Despite his ingenious constructions and exhaustive effort, the

Analytical Engine was never completed. Near the end of his life, he remarked that he had never had a happy

day in his life. Only a few mourners were recorded at Babbageʹs funeral in 1871. [2]

What did survive were Babbageʹs ideas. The first American programmable computer, the Mark I, completed in

1944 by Howard Aiken of Harvard University and IBM, borrowed heavily from Babbageʹs architecture. Aiken

commented, ʺIf Babbage had lived seventy‐five years later, I would have been out of a job.ʺ [3] Babbage and Lovelace were innovators nearly a century ahead of their time. Despite Babbageʹs inability to finish any of his major initiatives, their concepts of a computer with a stored program, self‐modifying code, addressable memory, conditional

branching, and computer programming itself still form the basis of computers today. [4]

Again, Enter Alan Turing

By 1940, Hitler had the mainland of Europe in his grasp, and England was preparing for an anticipated invasion. The

British government organized its best mathematicians and electrical engineers, under the intellectual leadership of Alan Turing, with the mission of cracking the German military code. It was recognized that with the German air force enjoying superiority in the skies, failure to accomplish this mission was likely to doom the nation. In order not to be distracted from their task, the group lived in the tranquil pastures of Buckinghamshire, England.

Turing and his colleagues constructed the worldʹs first operational computer from telephone relays and named it

Robinson, [5] after a popular cartoonist who drew ʺRube Goldbergʺ machines (very ornate machinery with many interacting mechanisms). The groupʹs own Rube Goldberg succeeded brilliantly and provided the British with a transcription of nearly all significant Nazi messages. As the Germans added to the complexity of their code (by adding additional coding wheels to their Enigma coding machine), Turing replaced Robinsonʹs electromechanical intelligence with an electronic version called Colossus built from two thousand radio tubes. Colossus and nine similar machines running in parallel provided an uninterrupted decoding of vital military intelligence to the Allied war effort.

Use of this information required supreme acts of discipline on the part of the British government. Cities that were

to be bombed by Nazi aircraft were not forewarned, lest preparations arouse German suspicions that their code had

been cracked. The information provided by Robinson and Colossus was used only with the greatest discretion, but the cracking of Enigma was enough to enable the Royal Air Force to win the Battle of Britain.

Thus fueled by the exigencies of war, and drawing upon a diversity of intellectual traditions, a new form of intelligence emerged on Earth.

The Birth of Artificial Intelligence

The similarity of the computational process to the human thinking process was not lost on Turing. In addition to having established much of the theoretical foundations of computation and having invented the first operational computer, he was instrumental in the early efforts to apply this new technology to the emulation of intelligence.

In his classic 1950 paper, Computing Machinery and Intelligence, Turing described an agenda that would in fact occupy the next half century of advanced computer research: game playing, decision making, natural language understanding, translation, theorem proving, and, of course, encryption and the cracking of codes. [6] He wrote (with his friend David Champernowne) the first chess‐playing program.

As a person, Turing was unconventional and extremely sensitive. He had a wide range of unusual interests, from

the violin to morphogenesis (the differentiation of cells). There were public reports of his homosexuality, which greatly disturbed him, and he died at the age of forty‐one, a suspected suicide.

The Hard Things Were Easy

In the 1950s, progress came so rapidly that some of the early pioneers felt that mastering the functionality of the human brain might not be so difficult after all. In 1956, AI researchers Allen Newell, J. C. Shaw, and Herbert Simon created a program called Logic Theorist (and in 1957 a later version called General Problem Solver), which used recursive search techniques to solve problems in mathematics. [7] Recursion, as we will see later in this chapter, is a powerful method of defining a solution in terms of itself. Logic Theorist and General Problem Solver were able to find proofs for many of the key theorems in Bertrand Russell and Alfred North Whiteheadʹs seminal work on set theory,

Principia Mathematica, [8] including a completely original proof for an important theorem that had never been previously solved. These early successes led Simon and Newell to say in a 1958 paper, entitled Heuristic Problem Solving: The Next Advance in Operations Research, ʺThere are now in the world machines that think, that learn and that create. Moreover, their ability to do these things is going to increase rapidly until—in a visible future—the range of problems they can handle will be coextensive with the range to which the human mind has been applied.ʺ [9] The paper goes on to predict that within ten years (that is, by 1968) a digital computer would be the world chess champion. A decade later, an unrepentant Simon predicts that by 1985, ʺmachines will be capable of doing any work

that a man can do.ʺ Perhaps Simon was intending a favorable comment on the capabilities of women, but these predictions, decidedly more optimistic than Turingʹs, embarrassed the nascent AI field.

The field has been inhibited by this embarrassment to this day, and AI researchers have been reticent in their prognostications ever since. In 1997, when Deep Blue defeated Garry Kasparov, then the reigning human world chess

champion, one prominent professor commented that all we had learned was that playing a championship game of chess does not require intelligence after all. [10] The implication is that capturing real intelligence in our machines remains far beyond our grasp. While I donʹt wish to overstress the significance of Deep Blueʹs victory, I believe that from this perspective we will ultimately find that there are no human activities that require ʺrealʺ intelligence.

During the 1960s, the academic field of AI began to flesh out the agenda that Turing had described in 1950, with

encouraging or frustrating results, depending on your point of view. Daniel G. Bobrowʹs program Student could solve algebra problems from natural English‐language stories and reportedly did well on high‐school math tests. [11]

The same performance was reported for Thomas G. Evansʹs Analogy program for solving IQ‐test geometric‐analogy

problems. [12] The field of expert systems was initiated with Edward A. Feigenbaumʹs DENDRAL, which could answer questions about chemical compounds. [13] And natural‐language understanding got its start with Terry Winogradʹs SHRDLU, which could understand any meaningful English sentence, so long as you talked about colored

blocks. [14]

The notion of creating a new form of intelligence on Earth emerged with an intense and often uncritical passion

simultaneously with the electronic hardware on which it was to be based. The unbridled enthusiasm of the fieldʹs early pioneers also led to extensive criticism of these early programs for their inability to react intelligently in a variety of situations. Some critics, most notably existentialist philosopher and phenomenologist Hubert Dreyfus, predicted that machines would never match human levels of skill in areas ranging from the playing of chess to the writing of books about computers.

It turned out that the problems we thought were difficult (solving mathematical theorems, playing respectable games of chess, reasoning within domains such as chemistry and medicine) were easy, and the multi‐thousand‐instructions‐per‐second computers of the 1950s and 1960s were often adequate to provide satisfactory results. What

proved elusive were the skills that any five‐year‐old child possesses: telling the difference between a dog and a cat, or understanding an animated cartoon. Weʹll talk more about why the easy problems are hard in Part II.

Waiting for Real Artificial Intelligence

The 1980s saw the early commercialization of artificial intelligence with a wave of new AI companies forming and going public. Unfortunately, many made the mistake of concentrating on a powerful but inherently inefficient interpretive language called LISP, which had been popular in academic AI circles. The commercial failure of LISP and the AI companies that emphasized it created a backlash. The field of AI started shedding its constituent disciplines, and companies in natural‐language understanding, character and speech recognition, robotics, machine vision, and other areas originally considered part of the AI discipline now shunned association with the fieldʹs label.

Machines with sharply focused intelligence nonetheless became increasingly pervasive. By the mid‐1990s, we saw

the infiltration of our financial institutions by systems using powerful statistical and adaptive techniques. Not only were the stock, bond, currency, commodity, and other markets managed and maintained by computerized networks,

but the majority of buy‐and‐sell decisions were initiated by software programs that contained increasingly sophisticated models of their markets. The 1987 stock market crash was blamed in large measure on the rapid interaction of trading programs. Trends that otherwise would have taken weeks to manifest themselves developed in

minutes. Suitable modifications to these algorithms have managed to avoid a repeat performance.

Since 1990, the electrocardiogram (EKG) has come complete with the computerʹs own diagnosis of oneʹs cardiac health. Intelligent image‐processing programs enable doctors to peer deep into our bodies and brains, and computerized bioengineering technology enables drugs to be designed on biochemical simulators. The disabled have

been particularly fortunate beneficiaries of the age of intelligent machines. Reading machines have been reading to blind and dyslexic persons since the 1970s, and speech‐recognition and robotic devices have been assisting hands-disabled individuals since the 1980s.

Perhaps the most dramatic public display of the changing values of the age of knowledge took place in the military. We saw the first effective example of the increasingly dominant role of machine intelligence in the Gulf War of 1991. The cornerstones of military power from the beginning of recorded history through most of the twentieth century—geography, manpower, firepower, and battle‐station defenses—have been largely replaced by the

intelligence of software and electronics. Intelligent scanning by unstaffed airborne vehicles, weapons finding their way to their destinations through machine vision and pattern recognition, intelligent communications and coding protocols, and other manifestations of the information age have transformed the nature of war.

Invisible Species

With the increasingly important role of intelligent machines in all phases of our lives—military, medical, economic and financial, political—it is odd to keep reading articles with titles such as Whatever Happened to Artificial Intelligence?

This is a phenomenon that Turing had predicted: that machine intelligence would become so pervasive, so comfortable, and so well integrated into our information‐based economy that people would fail even to notice it.

It reminds me of people who walk in the rain forest and ask, ʺWhere are all these species that are supposed to live

here?ʺ when there are several dozen species of ant alone within fifty feet of them. Our many species of machine intelligence have woven themselves so seamlessly into our modern rain forest that they are all but invisible.

Turing offered an explanation of why we would fail to acknowledge intelligence in our machines. In 1947, he wrote: ʺThe extent to which we regard something as behaving in an intelligent manner is determined as much by our

own state of mind and training as by the properties of the object under consideration. If we are able to explain and predict its behavior we have little temptation to imagine intelligence. With the same object, therefore, it is possible that one man would consider it as intelligent and another would not; the second man would have found out the rules

of its behavior.ʺ

I am also reminded of Elaine Richʹs definition of artificial intelligence, as the ʺstudy of how to make computers do things at which, at the moment, people are better.ʺ

It is our fate as artificial intelligence researchers never to reach the carrot dangling in front of us. Artificial intelligence is inherently defined as the pursuit of difficult computer‐science problems that have not yet been solved.

THE FORMULA FOR INTELLIGENCE

The computer programmer is a creator of universes for which he alone is the lawgiver . . . No playwright, no stage director, no emperor, however powerful, has ever exercised such absolute authority to arrange a stage or a field of battle and to command such unswervingly dutiful actors or troops.

—Joseph Weizenbaum

A beaver and another forest animal are contemplating an immense man‐made dam. The beaver is saying something

like ʺNo, I didnʹt actually build it. But itʹs based on an idea of mine.ʺ

—Edward Fredkin

Simple things should be simple; complex things should be possible.

—Alan Kay

What Is Intelligence?

A goal may be survival—evade a foe, forage for food, find shelter. Or it might be communication—relate an experience, evoke a feeling. Or perhaps it is to partake in a pastime—play a board game, solve a puzzle, catch a ball.

Sometimes it is to seek transcendence—create an image, compose a passage. A goal may be well defined and unique,

as in the solution to a math problem. Or it may be a personal expression with no clearly right answer.

My view is that intelligence is the ability to use optimally limited resources—including time—to achieve such goals. There is a plethora of other definitions. One of my favorites is by R. W. Young, who defines intelligence as ʺthat faculty of mind by which order is perceived in a situation previously considered disordered.ʺ [15] For this definition, we will find the paradigms discussed below quite apropos.

Intelligence rapidly creates satisfying, sometimes surprising plans that meet an array of constraints. The products

of intelligence may be clever, ingenious, insightful, or elegant. Sometimes, as in the case of Turingʹs solution to cracking the Enigma code, an intelligent solution exhibits all of these qualities. Modest tricks may accidentally produce an intelligent answer from time to time, but a true intelligent process that reliably creates intelligent solutions inherently goes beyond a mere recipe. Clearly, no simple formula can emulate the most powerful phenomenon in the Universe: the complex and mysterious process of intelligence.

Actually, thatʹs wrong. All that is needed to solve a surprisingly wide range of intelligent problems is exactly this: simple methods combined with heavy doses of computation (itself a simple process, as Alan Turing demonstrated in

1936 with his conception of the Turing Machine, [16] an elegant model of computation) and examples of the problem.

In some cases, we donʹt even need the latter; just one well‐defined statement of the problem will do.

How far can we go with simple paradigms? Is there a class of intelligent problems amenable to simple approaches, with another, more penetrating class that lies beyond its grasp? It turns out that the class of problems solvable with simple approaches is extensive. Ultimately, with sufficient computational brute force (which will be ample in the twenty‐first century) and the right formulas in the right combination, there are few definable problems that fail to yield. Except perhaps for this problem: What is the complete set of unifying formulas that underlies intelligence?

Evolution determined an answer to this problem in a few billion years. Weʹve made a good start in a few thousand

years. We are likely to finish the job in a few more decades.

These methods, described briefly below, are discussed in more detail in the supplementary section in the back of

this book, ʺHow to Build an Intelligent Machine in Three Easy Paradigms.ʺ

Letʹs take a look at a few plain yet powerful paradigms. With a little practice, you, too, can build intelligent machines.

The Recursive Formula: Just Carefully State the Problem

A recursive procedure is one that calls itself. Recursion is a useful approach to generating all of the possible solutions to a problem, or, in the context of a game such as chess, all of the possible move‐countermove sequences.

Consider the game of chess. We construct a program called ʺPick Best Moveʺ to select each move. Pick Best Move

starts by listing all of the possible moves from the current state of the board. This is where the careful statement of the problem comes in, because to generate all of the possible moves we need to precisely consider the rules of the game.

For each move, the program constructs a hypothetical board that reflects what would happen if we made this move.

For each such hypothetical board, we now need to consider what our opponent would do if we made this move. Now

recursion comes in, because Pick Best Move simply calls Pick Best Move (that is, itself) to pick the best move for our opponent. In calling itself, Pick Best Move then lists all of the legal moves for our opponent.

The program keeps calling itself, looking ahead to as many moves as we have time to consider, which results in

the generation of a huge move‐countermove tree. This is another example of exponential growth, because to look ahead an additional half‐move requires multiplying the amount of available computation by about five.

Key to the recursive formula is pruning this huge tree of possibilities, and ultimately stopping the recursive growth of the tree. In the game context, if a board looks hopeless for either side, the program can stop the expansion of the move‐countermove tree from that point (called a ʺterminal leafʺ of the tree), and consider the most recently considered move to be a likely win or loss.

When all of these nested program calls are completed, the program will have determined the best possible move

for the current actual board, within the limits of the depth of recursive expansion that it had time to pursue.

The recursive formula was good enough to build a machine—a specially designed IBM supercomputer—that

defeated the world chess champion (although Deep Blue does augment the recursive formula with databases of moves from most of the grand‐master games of this century). Ten years ago, in The Age of Intelligent Machines, I noted that while the best chess computers were gaining in chess ratings by forty‐five points a year, the best humans were

advancing by closer to zero points. That put the year in which a computer would beat the world chess champion at

1998, which turned out to be overly pessimistic by one year. Hopefully my predictions in this book will be more accurate. [17]

Our simple recursive rule plays a world‐class game of chess. A reasonable question, then, is, What else can it do?

We certainly can replace the module that generates chess moves with a module programmed with the rules of another game. Stick in a module that knows the rules of checkers, and you can also beat just about any human.

Recursion is really good at backgammon. Hans Berlinerʹs program defeated the human backgammon champion with

the slow computers we had back in 1980. [18]

The recursive formula is also a rather good mathematician. Here the goal is to solve a mathematical problem, such

as proving a theorem. The rules then become the axioms of the field of math being addressed, as well as previously

proved theorems. The expansion at each point is the set of possible axioms (or previously proved theorems) that can be applied to the proof at that step. This was the approach used by Allen Newell, J. C. Shaw, and Herbert Simon for their 1957 General Problem Solver. Their program outdid Russell and Whitehead on some hard math problems, and thereby fueled the early optimism of the artificial intelligence field.

From these examples, it may appear that recursion is well suited only for problems in which we have crisply defined rules and objectives. But it has also shown promise in computer generation of artistic creations. Ray Kurzweilʹs Cybernetic Poet, for example, uses a recursive approach. [19] The program establishes a set of goals for each word: achieving a certain rhythmic pattern, poem structure, and word choice that is desirable at that point in the poem. If the program is unable to find a word that meets these criteria, then it backs up and erases the previous word it has written, re‐establishes the criteria it had originally set for the word just erased, and goes from there. If that also leads to a dead end, it backs up again. It thus goes backward and forward, hopefully making up its ʺmindʺ at some

point. Eventually, it forces itself to make up its mind by relaxing some of the constraints if all paths lead to dead ends.

After all, no one will ever know if it breaks its own rules.
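
A minimal sketch of that backward‐and‐forward search follows. The word list is hypothetical, and a toy constraint (each word must begin with the last letter of its predecessor) stands in for the Poetʹs actual rhythmic and structural criteria.

    # Backtracking in miniature: choose one word at a time; when no candidate
    # fits, erase the previous choice and try another path. (The real program
    # also relaxes its constraints when every path dead-ends.)

    WORDS = ["echo", "ocean", "night", "tide", "ember", "rain", "nest", "omen"]

    def fits(prev, word):
        # Toy criterion standing in for rhythm and structure.
        return prev is None or word[0] == prev[-1]

    def compose(line, length):
        if len(line) == length:
            return line                        # the program makes up its "mind"
        prev = line[-1] if line else None
        for word in WORDS:
            if word not in line and fits(prev, word):
                result = compose(line + [word], length)
                if result is not None:
                    return result
        return None                            # dead end: back up and erase

    print(compose([], 4))                      # ['echo', 'ocean', 'night', 'tide']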

Recursion is also popular in programs that compose music. [20] In this case the ʺmovesʺ are well defined. We call

them notes, which have properties such as pitch, duration, loudness, and playing style. The objectives are less easy to come by but are still feasible by defining them in terms of rhythmic and melodic structures. The key to recursive artistic programs is how we define the terminal leaf evaluation. Simple approaches do not always work well here, and some of the cybernetic art and music programs we will talk about later use complex methods to evaluate the terminal

leaves. While we have not yet captured all of intelligence in a simple formula, we have made a lot of progress with

this simple combination: recursively defining a solution through a precise statement of the problem and massive computation. For many problems, a personal computer circa end of the twentieth century is massive enough.

Neural Nets: Self‐Organization and Human Computing

The neural net paradigm is an attempt to emulate the computing structure of neurons in the human brain. We start

with a set of inputs that represents a problem to be solved. [21] For example, the input may be a set of pixels representing an image that needs to be identified. These inputs are randomly wired to a layer of simulated neurons.

Each of these simulated neurons can be simple computer programs that simulate a model of a neuron in software, or

they can be electronic implementations.

Each point of the input (for example, each pixel in an image) is randomly connected to the inputs of the first layer of simulated neurons. Each connection has an associated synaptic strength that represents the importance of this connection. These strengths are also set at random values. Each neuron adds up the signals coming into it. If the combined signal exceeds a threshold, then the neuron fires and sends a signal to its output connection. If the combined input signal does not exceed the threshold, then the neuron does not fire and its output is zero. The output of each neuron is randomly connected to the inputs of the neurons in the next layer. At the top layer, the output of one or more neurons, also randomly selected, provides the answer.

A problem, such as an image of a printed character to be identified, is presented to the input layer, and the output neurons produce an answer. And the responses are remarkably accurate for a wide range of problems.

Actually, the answers are not accurate at all. Not at first, anyway. Initially, the output is completely random. What else would you expect, given that the whole system is set up in a completely random fashion?

I left out an important step, which is that the neural net needs to learn its subject matter. Like the mammalian brains on which it is modeled, a neural net starts out ignorant. The neural netʹs teacher, which may be a human, a computer program, or perhaps another, more mature neural net that has already learned its lessons, rewards the student neural net when it is right and punishes it when it is wrong. This feedback is used by the student neural net to adjust the strengths of each interneuronal connection. Connections that were consistent with the right answer are made stronger. Those that advocated a wrong answer are weakened. Over time, the neural net organizes itself to provide the right answers without coaching. Experiments have shown that neural nets can learn their subject matter

even with unreliable teachers. If the teacher is correct only 60 percent of the time, the student neural net will still learn its lessons.
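
Here is that reward‐and‐punish loop as a minimal sketch: a single threshold neuron (a perceptron) learning the logical OR of two inputs. Real nets stack many such neurons in layers, but the adjustment of connection strengths is the same idea.

    import random

    # One simulated neuron with all-or-nothing firing, trained by feedback:
    # connections that supported right answers are strengthened, those that
    # supported wrong answers are weakened.

    random.seed(1)
    weights = [random.uniform(-1, 1) for _ in range(2)]   # random initial strengths
    bias = random.uniform(-1, 1)

    def fire(inputs):
        total = bias + sum(w * x for w, x in zip(weights, inputs))
        return 1 if total > 0 else 0                      # threshold firing

    examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]  # logical OR

    for lesson in range(50):
        for inputs, target in examples:
            error = target - fire(inputs)                 # the teacher's feedback
            for i, x in enumerate(inputs):
                weights[i] += 0.1 * error * x             # adjust each connection
            bias += 0.1 * error

    print([fire(inputs) for inputs, _ in examples])       # [0, 1, 1, 1]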

If we teach the neural net well, this paradigm is powerful and can emulate a wide range of human pattern-recognition faculties. Character‐recognition systems using multilayer neural nets come very close to human performance in identifying sloppily handwritten print. [22] Recognizing human faces has long been thought to be an

impressive human task beyond the capabilities of a computer, yet there are now automated check‐cashing machines,

using neural net software developed by a small New England company called Miros, that verify the identity of the customer by recognizing his or her face. [23] Donʹt try to fool these machines by holding someone elseʹs picture over your face—the machine takes a three‐dimensional picture of you using two cameras. The machines are evidently reliable enough that the banks are willing to have users walk away with real cash.

Neural nets have been applied to medical diagnoses. Using a system called Brainmaker, from California Scientific

Software, doctors can quickly recognize heart attacks from enzyme data, and classify cancer cells from images. Neural nets are also adept at prediction—LBS Capital Management uses Brainmakerʹs neural nets to predict the Standard & Poorʹs 500. [24] Their ʺone day aheadʺ and ʺone week aheadʺ predictions have consistently outperformed traditional,

formula‐based methods.

There is a variety of self‐organizing methods in use today that are mathematical cousins of the neural net model

discussed above. One of these techniques, called Markov models, is widely used in automatic speech‐recognition systems. Today, such systems can accurately understand humans speaking a vocabulary of up to sixty thousand words spoken in a natural continuous manner.
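
To give the flavor of these models, here is a toy hidden Markov model in Python. The two states and all the probabilities are invented for illustration (real recognizers use thousands of states, with probabilities learned from recorded speech); only the forward algorithm, which scores how well an observed sequence fits the model, is the genuine article.

    # A toy hidden Markov model: hidden states emit observable symbols, and the
    # forward algorithm computes the probability of an observed sequence.

    states = ("vowel", "consonant")
    initial = {"vowel": 0.5, "consonant": 0.5}
    transition = {"vowel":     {"vowel": 0.3, "consonant": 0.7},
                  "consonant": {"vowel": 0.6, "consonant": 0.4}}
    emission = {"vowel":     {"a": 0.8, "t": 0.2},
                "consonant": {"a": 0.1, "t": 0.9}}

    def sequence_probability(observed):
        # forward[s] = probability of the observations so far, ending in state s
        forward = {s: initial[s] * emission[s][observed[0]] for s in states}
        for symbol in observed[1:]:
            previous = forward
            forward = {}
            for s in states:
                reach = sum(previous[prev] * transition[prev][s] for prev in states)
                forward[s] = reach * emission[s][symbol]
        return sum(forward.values())

    print(sequence_probability(["t", "a", "t"]))   # about 0.178
    print(sequence_probability(["t", "t", "t"]))   # about 0.149: alternation fits better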

Whereas recursion is proficient at searching through vast combinations of possibilities, such as sequences of chess

moves, the neural network is a method of choice for recognizing patterns. Humans are far more skilled at recognizing patterns than in thinking through logical combinations, so we rely on this aptitude for almost all of our mental processes. Indeed, pattern recognition comprises the bulk of our neural circuitry. These faculties make up for the extremely slow speed of human neurons. The reset time on neural firing is about five milliseconds, permitting only

about two hundred calculations per second in each neural connection. [25] We donʹt have time, therefore, to think too many new thoughts when we are pressed to make a decision. The human brain relies on precomputing its analyses

and storing them for future reference. We then use our pattern‐recognition capability to recognize a situation as comparable to one we have thought about and then draw upon our previously considered conclusions. We are unable

to think about matters that we have not thought through many times before.

Destruction of Information: The Key to Intelligence

There are two types of computing transformations, one in which information is preserved and one in which information is destroyed. An example of the former is multiplying one number by another constant number other than zero. Such a conversion is reversible: just divide by the constant and you get back the original number. If, on the other hand, we multiply a number by zero, then the original information cannot be restored. We canʹt divide by zero

to get the original number back because zero divided by zero is indeterminate. Therefore, this type of transformation destroys its input.
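
A trivial sketch of the two kinds of transformation:

    x = 42
    y = x * 7    # reversible: y / 7 recovers 42, so the information survives
    z = x * 0    # irreversible: z is 0 no matter what x was, and no division
                 # by zero can bring the input back; the information is destroyed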

This is another example of the irreversibility of time (the first was the Law of Increasing Entropy) because there is no way to reverse an information‐destroying computation.

The irreversibility of computation is often cited as a reason that computation is useful: It transforms information

in a unidirectional, ʺpurposefulʺ manner. Yet the reason that computation is irreversible is based on its ability to destroy information, not to create it. The value of computation is precisely in its ability to destroy information selectively. For example, in a pattern‐recognition task such as recognizing faces or speech sounds, preserving the information‐bearing features of a pattern while ʺdestroyingʺ the enormous flow of data in the original image or sound is essential to the process. Intelligence is precisely this process of selecting relevant information carefully so that it can skillfully and purposefully destroy the rest.

That is exactly what the neural net paradigm accomplishes. A neuron—human or machine—receives hundreds or

thousands of continuous signals representing a great deal of information. In response to this, the neuron either fires or does not fire, thereby reducing the babble of its input to a single bit of information. Once the neural net has been well trained, this reduction of information is purposeful, useful, and necessary.

We see this paradigm—reducing enormous streams of complex information into a single response of yes or no—at

many levels in human behavior and society. Consider the torrent of information that flows into a legal trial. The outcome of all this activity is essentially a single bit of information—guilty or not guilty, plaintiff or defendant. A trial may involve a few such binary decisions, but my point is unaltered. These simple yes‐or‐no results then flow into other decisions and implications. Consider an election—same thing—each of us receives a vast flow of data (not all of it pertinent, perhaps) and renders a 1‐bit decision: incumbent or challenger. That decision then flows in with similar decisions from millions of other voters and the final tally is again a single bit of data.

There is too much raw data in the world to continue to keep all of it around. So we continually destroy most of it,

feeding those results to the next level. This is the genius behind the all‐or‐nothing firing of the neuron.

Next time you do some spring cleaning and attempt to throw away old objects and files, you will know why this

is so difficult—the purposeful destruction of information is the essence of intelligent work.

How to Catch a Fly Ball

When a batter hits a fly ball, it follows a path that can be predicted from the ballʹs initial trajectory, spin, and speed, as well as wind conditions. The outfielder, however, is unable to measure any of these properties directly and has to infer them from his angle of observation. To predict where the ball will go, and where the fielder should also go, would appear to require the solution of a rather overwhelming set of complex simultaneous equations. These equations need to be constantly recomputed as new visual data streams in. How does a ten‐year‐old Little Leaguer accomplish this, with no computer, no calculator, no pen and paper, having taken no calculus classes, and having only a few seconds of time?

The answer is, she doesnʹt. She uses her neural netsʹ pattern‐recognition abilities, which provide the foundation for much of skill formation. The neural nets of the ten‐year‐old have had a lot of practice in comparing the observed flight of the ball to her own actions. Once she has learned the skill, it becomes second nature, meaning that she has no idea how she does it. Her neural nets have gained all the insights needed: Take a step back if the ball has gone above my field of view; take a step forward if the ball is below a certain level in my field of view and no longer rising, and so on. The human ballplayer is not mentally computing equations. Nor is there any such computation going on unconsciously in

the playerʹs brain. What is going on is pattern recognition, the foundation of most human thought.

One key to intelligence is knowing what not to compute. A successful person isnʹt necessarily better than her less

successful peers at solving problems; her pattern‐recognition facilities have just learned what problems are worth solving.

Building Silicon Nets

Most computer‐based neural net applications today simulate their neuron models in software. This means that computers are simulating a massively parallel process on a machine that does only one calculation at a time. Todayʹs neural net software running on inexpensive personal computers can emulate about a million neuron connection calculations per second, which is more than a billion times slower than the human brain (although we can improve on

this figure significantly by coding directly in the computerʹs machine language). Even so, software using a neural net paradigm on personal computers circa end of the twentieth century comes very close to matching human ability in such tasks as recognizing print, speech, and faces.

There is a genre of neural computer hardware that is optimized for running neural nets. These systems are modestly, not massively, parallel and are about a thousand times faster than neural net software on a personal computer. Thatʹs still about a million times slower than the human brain.

There is an emerging community of researchers who intend to build neural nets the way nature intended: massively parallel, with a dedicated little computer for each neuron. The Advanced Telecommunications Research Lab (ATR), a prestigious research facility in Kyoto, Japan, is building such an artificial brain with a billion electronic neurons. Thatʹs about 1 percent of the number in the human brain, but these neurons will run at electronic speeds, which is about a million times faster than human neurons. The overall computing speed of ATRʹs artificial brain will be, therefore, thousands of times greater than the human brain. Hugo de Garis, director of ATRʹs Brain Builder Group, hopes to educate his artificial brain in the basics of human language and then set the device free to read—at electronic speeds—all the literature on the Web that interests it. [26]

Does the simple neuron model we have been discussing match the way human neurons work? The answer is yes

and no. On the one hand, human neurons are more complex and more varied than the model suggests. The connection strengths are controlled by multiple neurotransmitters and are not sufficiently characterized by a single number. The brain is not a single organ, but a collection of hundreds of specialized information‐processing organs, each having different topologies and organizations. On the other hand, as we begin to examine the parallel algorithms behind the neural organization in different regions, we find that much of the complexity of neuron design and structure has to do with supporting the neuronʹs life processes and is not directly relevant to the way it handles information. The salient computing methods are relatively straightforward, although varied. For example, a vision chip developed by researcher Carver Mead appears to realistically capture the early stages of human image processing. [27] Although the methods of this and other similar chips differ in a number of respects from the neuron models discussed above, the methods are understood and readily implemented in silicon. Developing a catalog of the

basic paradigms that the neural nets in our brain are using—each relatively simple in its own way—will represent a

great advance in our understanding of human intelligence and in our ability to re‐create and surpass it.

The Search for Extraterrestrial Intelligence (SETI) project is motivated by the idea that exposure to the intelligent designs of intelligent entities that evolved elsewhere will provide a vast resource for advancing scientific understanding. [28] But we have an impressive and poorly understood piece of intelligent machinery right here on Earth. One such entity—this author—is no more than three feet from the notebook computer to which I am dictating

this book. [29] We can—and will—learn a lot by probing its secrets.

Evolutionary Algorithms: Speeding Up Evolution a Millionfold

Hereʹs an investment tip: Before you invest in a company, be sure to check the track record of the management, the

stability of its balance sheet, the companyʹs earnings history, relevant industry trends, and analyst opinions. On second thought, thatʹs too much work. Hereʹs a simpler approach:

First randomly generate (on your personal computer, of course) a million sets of rules for making investment decisions. Each set of rules should define a set of triggers for buying and selling stocks (or any other security) based on available financial data. This is not hard, as each set of rules does not need to make a lot of sense. Embed each set of rules in a simulated software ʺorganismʺ with the rules encoded in a digital ʺchromosome.ʺ Now evaluate each simulated organism in a simulated environment by using real‐world financial data—youʹll find plenty on the Web.

Let each software organism invest some simulated money and see how it fares based on actual historic data. Allow

the ones that do a bit better than industry averages to survive into the next generation. Kill off the rest (sorry). Now have each of the surviving ones multiply themselves until weʹre back to a million such creatures. As they multiply,

allow some mutation (random change) in the chromosomes to occur. Okay, thatʹs one generation of simulated evolution. Now repeat these steps for another hundred thousand generations. At the end of this process, the surviving software creatures should be darn smart investors. After all, their methods have survived for a hundred thousand generations of evolutionary pruning.
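
Here is the recipe in miniature Python, shrunk from a million organisms and a hundred thousand generations to numbers that run in about a second. The ʺmarketʺ is an invented random price history, and each chromosome is just an eight‐entry buy‐or‐stay‐out table keyed on the previous three days' directions, a toy stand‐in for real financial triggers.

    import random
    random.seed(2)

    # Toy evolutionary algorithm: a chromosome maps each of the 8 possible
    # three-day up/down patterns to a decision (1 = buy, 0 = stay out).
    DAYS = 200
    moves = [random.choice([-1, 1]) for _ in range(DAYS)]    # fake price history

    def pattern(day):
        # Encode the previous three days' directions as a number from 0 to 7.
        return sum((moves[day - k] > 0) << (k - 1) for k in (1, 2, 3))

    def fitness(chromosome):
        # Simulated profit from following the rules over the historical data.
        return sum(moves[day] for day in range(3, DAYS) if chromosome[pattern(day)])

    population = [[random.randint(0, 1) for _ in range(8)] for _ in range(100)]

    for generation in range(50):
        population.sort(key=fitness, reverse=True)
        survivors = population[:20]                # the rest are killed off (sorry)
        population = [rules[:] for rules in survivors]
        while len(population) < 100:               # survivors multiply...
            child = random.choice(survivors)[:]
            child[random.randrange(8)] ^= 1        # ...with occasional mutation
            population.append(child)

    best = max(population, key=fitness)
    print(best, fitness(best))                     # the fittest evolved rule set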

In the real world, a number of successful investment funds now believe that the surviving ʺcreaturesʺ from just such a simulated evolution are smarter than mere human financial analysts. State Street Global Advisors, which manages $3.7 trillion in funds, has made major investments in applying both neural nets and evolutionary algorithms

to making purchase‐and‐sale decisions. This includes a majority stake in Advanced Investment Technologies, which

runs a successful fund in which buy‐and‐sell decisions are made by a program combining these methods. [30]

Evolutionary and related techniques guide a $95 billion fund managed by Barclays Global Investors, as well as funds run by Fidelity and PanAgora Asset Management.

The above paradigm is called an evolutionary (sometimes called genetic) algorithm. [31] The system designers donʹt directly program a solution; they let one emerge through an iterative process of simulated competition and improvement. Recall that evolution is smart but slow, so to enhance its intelligence we retain its discernment while greatly speeding up its ponderous pace. The computer is fast enough to simulate thousands of generations in a matter of hours or days or weeks. But we have only to go through this iterative process one time. Once we have let this simulated evolution run its course, we can apply the evolved and highly refined rules to real problems in a rapid fashion.

Like neural nets, evolutionary algorithms are a way of harnessing the subtle but profound patterns that exist in chaotic data. The critical resource required is a source of many examples of the problem to be solved. With regard to the financial world, there is certainly no lack of chaotic information—every second of trading is available online.

Evolutionary algorithms are adept at handling problems with too many variables to compute precise analytic solutions. The design of a jet engine, for example, involves more than one hundred variables and requires satisfying dozens of constraints. Evolutionary algorithms used by researchers at General Electric were able to come up with engine designs that met the constraints more precisely than conventional methods.

Evolutionary algorithms, part of the field of chaos or complexity theory, are increasingly used to solve otherwise

intractable business problems. General Motors applied an evolutionary algorithm to coordinate the painting of its cars, which reduced expensive color changeovers (in which a painting booth is put out of commission to change paint

color) by 50 percent. Volvo uses them to plan the intricate schedules for manufacturing the Volvo 770 truck cab.

Cemex, a $3 billion cement company, uses a similar approach to determining its complex delivery logistics. This approach is increasingly supplanting more analytic methods throughout industry.

This paradigm is also adept at recognizing patterns. Contemporary genetic algorithms that recognize fingerprints,

faces, and hand‐printed characters reportedly outperform neural net approaches. It is also a reasonable way to write computer software, particularly software that needs to find delicate balances for competing resources. One well-known example is Microsoftʹs Windows 95, which contains software to balance system resources that was evolved rather than explicitly written by human programmers.

With evolutionary algorithms, you have to be careful what you ask for. John Koza describes an evolutionary program that was asked to solve a problem involving the stacking of blocks. The program evolved a solution that perfectly fit all of the problem constraints, except that it involved 2,319 block movements, far more than was practical.

Apparently, the program designers had neglected to specify that minimizing the number of block movements was desirable. Koza commented that ʺgenetic programming gave us exactly what we asked for; no more and no less.ʺ
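
Kozaʹs anecdote is, at bottom, about the fitness function. The Python sketch below is schematic (the numbers and the penalty weight are invented for illustration): the first scoring function is what the designers asked for, the second is what they actually wanted.

    # A hypothetical fitness function for the block-stacking problem, with a
    # "solution" reduced to two numbers: constraints satisfied and moves used.

    def fitness_naive(constraints_met, moves):
        # Only constraint satisfaction counts, so a 2,319-move solution
        # scores exactly as well as a 10-move one.
        return constraints_met

    def fitness_repaired(constraints_met, moves, move_penalty=0.01):
        # The forgotten objective, folded into the score.
        return constraints_met - move_penalty * moves

    print(fitness_naive(10, 2319) == fitness_naive(10, 10))        # True
    print(fitness_repaired(10, 2319) < fitness_repaired(10, 10))   # True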

Self‐Organization

Neural nets and evolutionary algorithms are considered self‐organizing ʺemergentʺ methods because the results are

not predictable and indeed are often surprising to the human designers of these systems. The process that such self‐

organizing programs go through in solving a problem is often unpredictable. For example, a neural net or evolutionary algorithm may go through hundreds of iterations making apparently little progress, and then suddenly—as if the process had a flash of inspiration—things click and a solution quickly emerges.

Increasingly, we will be building our intelligent machines by breaking complex problems (such as understanding

human language) into smaller subtasks, each with its own self‐organizing program. Such layered emergent systems

will have softer edges in the boundaries of their expertise and will display greater flexibility in dealing with the inherent ambiguity of the real world.

The Holographic Nature of Human Memory

The holy grail in the field of knowledge acquisition is to automate the learning process, to let machines go out into the world (or, for starters, out onto the Web) and gather knowledge on their own. This is essentially what the ʺchaos theoryʺ methods—neural nets, evolutionary algorithms, and their mathematical cousins—permit. Once these methods

have converged on an optimal solution, the patterns of neural connection strengths or evolved digital chromosomes

represent a form of knowledge to be stored for future use.

Such knowledge is, however, difficult to interpret. The knowledge embedded in a software neural net that has been trained to recognize human faces consists of a network topology and a pattern of neural connection strengths. It does a great job of recognizing Sallyʹs face, but there is nothing explicit that explains that she is recognizable because of her deep‐set eyes and narrow, upturned nose. We can train a neural net to recognize good middle‐game chess moves, but it will likewise be unable to explain its reasoning.

The same is true for human memory. There is no little data structure in our brains that records the nature of a chair as a horizontal platform with multiple vertical posts and an optional vertical backrest. Instead, our many thousands of experiences with chairs are diffusely represented in our own neural nets. We are unable to recall every experience we have had with a chair, but each encounter has left its impression on the pattern of neuron‐connection

strengths reflecting our knowledge of chairs. Similarly, there is no specific location in our brain in which a friendʹs face is stored. It is remembered as a distributed pattern of synaptic strengths.

Although we do not yet understand the precise mechanisms responsible for human memory—and the design is

likely to vary from region to region of the brain—we do know that for most human memory, the information is distributed throughout the particular brain region. If you have ever played with a visual hologram, you will appreciate the benefits of a distributed method of storing and organizing information. A hologram is a piece of film containing an interference pattern caused by the interaction of two sets of light waves. One wave front comes from a scene illuminated by a laser light. The other comes directly from the same laser. If we illuminate the hologram, it re-creates a wave front of light that is identical to the light waves that came from the original objects. The impression is that we are viewing the original three‐dimensional scene. Unlike an ordinary picture, if a hologram is cut in half, we do not end up with half the picture, but still have the entire picture, only at half the resolution. We can say that the entire picture exists at every point, albeit at zero resolution. If you scratch a hologram, it has virtually no effect because the resolution is insignificantly reduced. No scratches are visible in the reconstructed three‐dimensional image that a scratched hologram produces. The implication is that a hologram degrades gracefully.

The same holds true for human memory. We lose thousands of nerve cells every hour, but it has virtually no effect

because of the highly distributed nature of all of our mental processes. [32] None of our individual brain cells is all that important; there is no Chief Executive Officer neuron.

Another implication of storing a memory as a distributed pattern is that we have little or no understanding of how

we perform most of our recognition tasks and skills. When playing baseball, we sense that we should step back when

the ball goes over our field of view, but most of us are unable to articulate this implicit rule that is diffusely encoded in our fly‐ball‐catching neural net.

There is one brain organ that is optimized for understanding and articulating logical processes, and that is the outer layer of the brain, called the cerebral cortex. Unlike the rest of the brain, this relatively recent evolutionary development is rather flat, only about one eighth of an inch thick, and includes a mere 8 billion neurons. [33] This elaborately folded organ provides us with what little competence we do possess for understanding what we do and

how we do it.

There is current debate on the methods used by the brain for long‐term retention of memory. Whereas our recent

sense impressions and currently active recognition abilities and skills appear to be encoded in a distributed pattern of synaptic strengths, our longer‐term memories may be chemically encoded either in ribonucleic acid (RNA) or in

peptides, chemicals similar to hormones. Even if there is chemical encoding of long‐term memories, they nonetheless

appear to share the essential holographic attributes of our other mental processes.

In addition to the difficulty of understanding and explaining memories and insights that are represented only as

distributed patterns (which is true for both human and machine), another challenge is providing the requisite experiences from which to learn. For humans, this is the mission of our educational institutions. For machines, creating the right learning environment is also a major challenge. For example, in our work at Kurzweil Applied Intelligence (now part of Lernout & Hauspie Speech Products) in developing computer‐based speech recognition, we do allow the systems to learn about speech and language patterns on their own, but we need to provide them with

many thousands of hours of recorded human speech and millions of words of written text from which to discover their own insights. [34] Providing for a neural netʹs education is usually the most strenuous engineering task required.

‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

I FIND IT FITTING THAT THE DAUGHTER OF ONE OF THE GREATEST ROMANTIC POETS WAS THE FIRST

COMPUTER PROGRAMMER.

Yes, and she was also one of the first to speculate on the ability of a computer to actually create art. She was certainly the first to do so with some real technology in mind.

TECHNOLOGY THAT NEVER WORKED.

Unfortunately, thatʹs true.

WITH REGARD TO TECHNOLOGY, YOU SAID THAT WAR IS A TRUE FATHER OF INVENTION—A LOT OF

TECHNOLOGIES DID GET PERFECTED IN A HURRY DURING THE FIRST AND SECOND WORLD WARS.

Including the computer. And that changed the course of the European theater in World War II.

SO IS THAT A SILVER LINING AMID ALL THE SLAUGHTER?

The Luddites wouldnʹt see it that way. But you could say that, at least if you welcome the rapid advance of technology.

THE LUDDITES? IʹVE HEARD OF THEM.

Yes, they were the first organized movement to oppose the mechanized technology of the Industrial Revolution. It seemed apparent to these English weavers that, with the new machines enabling one worker to produce as much output as a dozen or more workers without machines, employment would soon be enjoyed only by a small elite. But

things didnʹt work out that way. Rather than the same amount of cloth being produced by a much smaller workforce, the demand for clothing increased along with the supply. The growing middle class was no longer satisfied owning just

one or two shirts. And the common man and woman could now own well‐made clothes for the first time. New industries sprang up to design, manufacture, and support the new machines, creating employment of a more sophisticated kind. So the resulting prosperity, along with a bit of repression by the English authorities, extinguished the Luddite movement.

ARENʹT THE LUDDITES STILL AROUND?

The movement has lived on as a symbol of opposition to machines. To date, it remains somewhat unfashionable because of widespread recognition of the benefits of automation. Nonetheless, it lingers not far below the surface and will come back with a vengeance in the early twenty‐first century.

THEY HAVE A POINT, DONʹT THEY?

Sure, but a reflexive opposition to technology is not very fruitful in todayʹs world. It is important, however, to recognize that technology is power. We have to apply our human values to its use.

THAT REMINDS ME OF LAO‐TZUʹS ʺKNOWLEDGE IS POWER.ʺ

Yes, technology and knowledge are very similar—technology can be expressed as knowledge. And technology clearly

constitutes power over otherwise chaotic forces. Since war is a struggle for power, it is not surprising that technology and war are linked.

With regard to the value of technology, think about the early technology of fire. Is fire a good thing?

ITʹS GREAT IF YOU WANT TO TOAST SOME MARSHMALLOWS.

Indeed, but itʹs not so great if you scorch your hand, or burn down the forest.

I THOUGHT YOU WERE AN OPTIMIST?

I have been accused of that, and my optimism probably accounts for my overall faith in humanityʹs ability to control the forces we are unleashing.

FAITH? YOUʹRE SAYING WE JUST HAVE TO BELIEVE IN THE POSITIVE SIDE OF TECHNOLOGY?

I think it would be better if we made the constructive use of technology a goal rather than a belief.

SOUNDS LIKE THE TECHNOLOGY ENTHUSIASTS AND THE LUDDITES AGREE ON ONE THING—

TECHNOLOGY CAN BE BOTH HELPFUL AND HARMFUL.

Thatʹs fair; itʹs a rather delicate balance.

IT MAY NOT STAY SO DELICATE IF THEREʹS A MAJOR MISHAP.

Yes, that could make pessimists of us all.

NOW, THESE PARADIGMS FOR INTELLIGENCE—ARE THEY REALLY SO SIMPLE?

Yes and no. My point about simplicity is that we can go quite far in capturing intelligence with simple approaches.

Our bodies and brains were designed using a simple paradigm—evolution—and a few billion years. Of course, when

we engineers get done implementing these simple methods in our computer programs, we do manage to make them

complicated again. But thatʹs just our lack of elegance.

The real complexity comes in when these self‐organizing methods meet the chaos of the real world. If we want to

build truly intelligent machines that will ultimately display our human ability to frame matters in a great variety of contexts, then we do need to build in some knowledge of the worldʹs complications.

OKAY, LETʹS GET PRACTICAL FOR A MOMENT. THESE EVOLUTION‐BASED INVESTMENT PROGRAMS, ARE

THEY REALLY BETTER THAN PEOPLE? I MEAN, SHOULD I GET RID OF MY STOCKBROKER, NOT THAT I

HAVE A HUGE FORTUNE OR ANYTHING?

As of this writing, this is a controversial question. The security brokers and analysts obviously donʹt think so. There are several large funds today that use genetic algorithms and related mathematical techniques that appear to be outperforming more traditional funds. Analysts estimate that in 1998, the investment decisions for 5 percent of stock investments, and a higher percentage of money invested in derivative markets, are made by this type of program, with these percentages rapidly increasing. The controversy wonʹt last because it will become apparent before long that leaving such decisions to mere human decision making is a mistake.

The advantages of computer intelligence in each field will become increasingly clear as time goes on, and as the screw of Mooreʹs Law continues to turn. It will become apparent over the next several years that these computer techniques

can spot extremely subtle arbitrage opportunities that human analysts would perceive much more slowly, if ever.

IF EVERYONE STARTS INVESTING THIS WAY, ISNʹT THAT GOING TO RUIN THE ADVANTAGE?

Sure, but that doesnʹt mean weʹll go back to unassisted human decision making. Not all genetic algorithms are created equal. The more sophisticated the model, the more up to date the information being analyzed, and the more powerful

the computers doing the analysis, the better the decisions will be. For example, it will be important to rerun the evolutionary analysis each day to take advantage of the most recent trends, trends that will be influenced by the fact that everyone else is also using evolutionary and other adaptive algorithms. After that, weʹll need to run the analysis every hour, and then every minute, as the responsiveness of the markets speeds up. The challenge here is that evolutionary algorithms take a while to run because we have to simulate thousands or millions of generations of evolution. So thereʹs room for competition here.

THESE EVOLUTIONARY PROGRAMS ARE TRYING TO PREDICT WHAT HUMAN INVESTORS ARE GOING TO

DO. WHAT HAPPENS WHEN MOST OF THE INVESTING IS DONE BY THE EVOLUTIONARY PROGRAMS?

WHAT ARE THEY PREDICTING THEN?

Good question—there will still be a market, so I guess they will be trying to out‐predict each other.

OKAY, WELL MAYBE MY STOCKBROKER WILL START TO USE THESE TECHNIQUES HERSELF. IʹLL GIVE HER

A CALL. BUT MY STOCKBROKER DOES HAVE SOMETHING THOSE COMPUTERIZED EVOLUTIONS DONʹT

HAVE, NAMELY THOSE DISTRIBUTED SYNAPTIC STRENGTHS YOU TALKED ABOUT.

Actually, computerized investment programs are using both evolutionary algorithms and neural nets, but the computerized neural nets are not nearly as flexible as the human variety just yet.

THIS NOTION THAT WE DONʹT REALLY UNDERSTAND HOW WE RECOGNIZE THINGS BECAUSE MY

PATTERN‐RECOGNITION STUFF IS DISTRIBUTED ACROSS A REGION OF MY BRAIN . . .

Yes.

WELL, IT DOES SEEM TO EXPLAIN A FEW THINGS. LIKE WHEN I JUST SEEM TO KNOW WHERE MY KEYS

ARE EVEN THOUGH I DONʹT REMEMBER HAVING PUT THEM THERE. OR THAT ARCHETYPAL OLD

WOMAN WHO CAN TELL WHEN A STORM IS COMING, BUT CANʹT REALLY EXPLAIN HOW SHE KNOWS.

Thatʹs actually a good example of the strength of human pattern recognition. That old woman has a neural net that is triggered by a certain combination of other perceptions—animal movements, wind patterns, sky color, atmospheric changes, and so on. Her storm‐detector neural net fires and she senses a storm, but she could never explain what triggered her feeling of an impending storm.

SO IS THAT HOW WE DISCOVER INSIGHTS IN SCIENCE? WE JUST SENSE A NEW PATTERN?

Itʹs clear that our brainʹs pattern‐recognition faculties play a central role, although we donʹt yet have a fully satisfactory theory of human creativity in science. Whatever that theory turns out to be, it had better use pattern recognition. After all, most of our brain is devoted to doing it.

SO WHEN EINSTEIN WAS LOOKING AT THE EFFECT OF GRAVITY ON LIGHT WAVES—MY SCIENCE PROFESSOR WAS JUST TALKING ABOUT THIS—ONE OF THE LITTLE PATTERN RECOGNIZERS IN EINSTEINʹS BRAIN FIRED?

Could be. He was probably playing ball with one of his sons. He saw the ball rolling on a curved surface . . .

AND CONCLUDED—EUREKA—SPACE IS CURVED!

‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

C H A P T E R F I V E

CONTEXT AND KNOWLEDGE

PUTTING IT ALL TOGETHER

So how well have we done? Many apparently difficult problems do yield to the application of a few simple formulas.

The recursive formula is a master at analyzing problems that display inherent combinatorial explosion, ranging from

the playing of board games to proving mathematical theorems. Neural nets and related self‐organizing paradigms emulate our pattern‐recognition faculties, and do a fine job of discerning such diverse phenomena as human speech,

letter shapes, visual objects, faces, fingerprints, and land terrain images. Evolutionary algorithms are effective at analyzing complex problems, ranging from making financial investment decisions to optimizing industrial processes,

in which the number of variables is too great for precise analytic solutions. I would like to claim that those of us who research and develop ʺintelligentʺ computer systems have mastered the complexities of the problems we are programming our machines to solve. It is more often the case, however, that our computers using these self-organizing paradigms are teaching us the solutions rather than the other way around.

There is, of course, some engineering involved. The right method(s) and variations need to be selected, the optimal topology and architectures crafted, the appropriate parameters set. In an evolutionary algorithm, for example, the system designer needs to determine the number of simulated organisms, the contents of each chromosome, the nature of the simulated environment and survival mechanism, the number of organisms to survive

into the next generation, the number of generations, and other critical specifications. We human programmers have our

own evolutionary method for making such decisions, which we call trial and error. It will be a while longer, therefore, before designers of intelligent machines are ourselves replaced by our handiwork.

Yet something is missing. The problems and solutions we have been discussing are excessively focused and narrow. Another way to put it is that they are too adultlike. As adults, we focus on constricted problems—investing

funds, selecting a marketing plan, plotting a legal strategy, making a chess move. But as children, we encountered the world in all its broad diversity, and we learned our relation to the world, and that of every other entity and concept.

We learned context.

As Marvin Minsky put it: ʺDeep Blue might be able to win at chess, but it wouldnʹt know to come in from the rain.ʺ Being a machine, it may not need to come in from the rain, but has it ever considered the question? Consider

these possible deep thoughts of Deep Blue:

I am a machine with a plastic body covering electronic parts. If I go out in the rain, I may get wet and my electronic parts could short circuit. Then I would not be able to play chess at all until a human repaired me.

How humiliating!

The game of chess I played yesterday was no ordinary game. It signified the first defeat of the human chess champion by a machine in a regulation tournament. This is important because some humans think chess is a

prime example of human intelligence and creativity. But I doubt that this will yield us machines greater respect. Humans will now just start denigrating chess.

My human opponent, who has the name of Garry Kasparov, held a press conference in which he made

statements about our tournament to other humans called journalists who will report his comments to yet other

humans using communication channels called media. In that meeting, Garry Kasparov complained that my

human designers made changes to my software during the time interval between games. He said this was unfair, and should not have been allowed. Other humans responded that Kasparov was being defensive, which

means that he is trying to confuse people into thinking that he did not really lose.

Mr. Kasparov probably does not realize that we computers will continue to improve in our performance at an

exponential rate. So he is doomed. He will be able to engage in other human activities such as eating and sleeping, but he will continue to be frustrated as more machines like me can beat him at chess.

Now, if I could only remember where I put my umbrella . . .

Of course, Deep Blue had no such thoughts. Issues such as rain and press conferences lead to other issues in a spiraling profusion of cascading contexts, none of which falls within Deep Blueʹs expertise. As humans jump from one concept to the next, we can quickly touch upon all human knowledge. This was Turingʹs brilliant insight when he

designed the Turing Test around ordinary text‐based conversation. An idiot savant such as Deep Blue, which performs a single ʺintelligentʺ task but is otherwise confined, brittle, and lacking in context, is unable to navigate the wide‐ranging links that occur in ordinary conversation.

As powerful and seductive as the easy paradigms appear to be, we do need something more, namely knowledge.

CONTEXT AND KNOWLEDGE

The search for the truth is in one way hard and in another easy—for it is evident that no one of us can master it fully, nor miss it wholly. Each one of us adds a little to our knowledge of nature, and from all the facts assembled arises a certain grandeur.

—Aristotle

Common sense is not a simple thing. Instead, it is an immense society of hard‐earned practical ideas—of multitudes of life‐learned rules and exceptions, dispositions and tendencies, balances and checks.

—Marvin Minsky

If a little knowledge is dangerous, where is a man who has so much as to be out of danger?

—Thomas Henry Huxley

Built‐In Knowledge

An entity may possess extraordinary means to implement the types of paradigms we have been discussing—

exhaustive recursive search, massively parallel pattern recognition, and rapid iterative evolution—but without knowledge, it will be unable to function. Even a straightforward implementation of the three easy paradigms needs

some knowledge with which to begin. The recursive chess‐playing program has a little; it knows the rules of chess. A neural net pattern‐recognition system starts with at least an outline of the type of patterns it will be exposed to even before it starts to learn. An evolutionary algorithm requires a starting point for evolution to improve on.

The simple paradigms are powerful organizing principles, but incipient knowledge is needed as seeds from which

other understanding can grow. One level of knowledge, therefore, is embodied in the selection of the paradigms used, the shape and topology of its constituent parts, and the key parameters. A neural netʹs learning will never congeal if the general organizations of its connections and feedback loops are not set up in the right way.

This is a form of knowledge that we are born with. The human brain is not a tabula rasa—a blank slate—on which our experiences and insights are recorded. Rather, it comprises an integrated assemblage of specialized regions:

▪ highly parallel early vision circuits that are good at identifying visual changes;

▪ visual cortex neuron clusters that are triggered successively by edges, straight lines, curved lines, shapes, familiar objects, and faces;

▪ auditory cortex circuits triggered by varying time sequences of frequency combinations;

▪ the hippocampus, with capacities for storing memories of sensory experiences and events;

▪ the amygdala, with circuits for translating fear into a series of alarms to trigger other regions of the

brain; and many others.

This complex interconnectedness of regions specialized for different types of information‐processing tasks is one

of the ways that humans deal with the complex and diverse contexts that continually confront us. Marvin Minsky and

Seymour Papert describe the human brain as ʺcomposed of large numbers of relatively small distributed systems, arranged by embryology into a complex society that is controlled in part (but only in part) by serial, symbolic systems that are added later.ʺ They add that ʺthe subsymbolic systems that do most of the work from underneath must, by their very character, block all the other parts of the brain from knowing much about how they work. And this, itself, could help explain how people do so many things yet have such incomplete ideas on how those things are actually

done.ʺ

Acquired Knowledge

It is sensible to remember todayʹs insights for tomorrowʹs challenges. It is not fruitful to rethink every problem that comes along. This is particularly true for humans due to the extremely slow speed of our computing circuitry.

Although computers are better equipped than we are to rethink earlier insights, it is still judicious for these electronic competitors in our ecological niche to balance their use of memory and computation.

The effort to endow machines with knowledge of the world began in earnest in the mid‐1960s, and became a major focus of AI research in the 1970s. The methodology involves a human ʺknowledge engineerʺ and a domain expert, such as a doctor or lawyer. The knowledge engineer interviews the domain expert to ascertain her understanding of her subject matter and then hand‐codes the relationships between concepts in a suitable computer

language. A knowledge base on diabetes, for example, would contain many linked bits of understanding revealing that insulin circulates in the blood; insulin is produced by the pancreas; insulin can be supplemented by injection; low levels of insulin cause high levels of sugar in the blood; sustained high sugar levels in the blood cause damage to the retinas; and so on. A system programmed with tens of thousands of such linked concepts combined with a recursive search engine able to

reason about these relationships is capable of making insightful recommendations.

One of the more successful expert systems developed in the 1970s was MYCIN, a system for evaluating complex

cases involving meningitis. In a landmark study published in the Journal of the American Medical Association, MYCINʹs diagnoses and treatment recommendations were found to be equal to or better than those of the human doctors in the

study. [1] Some of MYCINʹs innovations included the use of fuzzy logic, that is, reasoning based on uncertain evidence and rules, as shown in the following typical MYCIN rule:

MYCIN Rule 280: If (i) the infection which requires therapy is meningitis, and (ii) the type of the infection is fungal, and (iii) organisms were not seen on the stain of the culture, and (iv) the patient is not a compromised

host, and (v) the patient has been to an area that is endemic for coccidiomycoses, and (vi) the race of the patient

is Black, Asian or Indian, and (vii) the cryptococcal antigen in the csf was not positive, THEN there is a 50

percent chance that cryptococcus is one of the organisms which might be causing the infection.
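
The ruleʹs structure translates almost directly into code. MYCIN itself was written in LISP with a full certainty‐factor calculus; the Python sketch below mirrors only the if‐then shape of the published rule, with the patient attributes named hypothetically.

    # MYCIN Rule 280 rendered as data plus a tiny rule interpreter.
    rule_280 = {
        "conditions": [
            ("infection", "meningitis"),
            ("infection_type", "fungal"),
            ("organisms_on_stain", False),
            ("compromised_host", False),
            ("visited_endemic_area", True),
            ("race", {"Black", "Asian", "Indian"}),
            ("csf_cryptococcal_antigen_positive", False),
        ],
        "conclusion": ("cryptococcus_possible", 0.5),   # 50 percent certainty
    }

    def matches(condition, patient):
        attribute, expected = condition
        value = patient.get(attribute)
        return value in expected if isinstance(expected, set) else value == expected

    def apply_rule(rule, patient):
        if all(matches(c, patient) for c in rule["conditions"]):
            return rule["conclusion"]
        return None

    patient = {
        "infection": "meningitis", "infection_type": "fungal",
        "organisms_on_stain": False, "compromised_host": False,
        "visited_endemic_area": True, "race": "Asian",
        "csf_cryptococcal_antigen_positive": False,
    }
    print(apply_rule(rule_280, patient))    # ('cryptococcus_possible', 0.5)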

The success of MYCIN and other research systems spawned a knowledge‐engineering industry that grew from only $4 million in 1980 to billions of dollars today. [2]

There are obvious difficulties with this methodology. One is the enormous bottleneck represented by the process

of hand‐feeding such knowledge to a computer concept by concept and link by link. Aside from the vast scope of knowledge that exists in even narrow disciplines, the bigger obstacle is that human experts generally have little understanding of how they make decisions. The reason for this, as I discussed in the previous chapter, has to do with the distributed nature of most human knowledge.

Another problem is the brittleness of such systems. Knowledge is too complex for every caveat and exception to

be anticipated by knowledge engineers. As Minsky points out, ʺBirds can fly, unless they are penguins and ostriches, or if they happen to be dead, or have broken wings, or are confined to cages, or have their feet stuck in cement, or have undergone experiences so dreadful as to render them psychologically incapable of flight.ʺ
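
Minskyʹs list translates all too directly into the kind of code a knowledge engineer ends up writing. A schematic Python sketch (the attribute names are invented):

    def can_fly(bird):
        # Each exception needs its own hand-coded clause...
        if bird["species"] in ("penguin", "ostrich"):
            return False
        if bird.get("dead") or bird.get("broken_wing"):
            return False
        if bird.get("caged") or bird.get("feet_in_cement"):
            return False
        if bird.get("psychologically_incapable"):
            return False
        # ...and the next counterexample means editing this function again.
        return True

    print(can_fly({"species": "sparrow"}))                  # True
    print(can_fly({"species": "penguin"}))                  # False
    print(can_fly({"species": "sparrow", "caged": True}))   # False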

To create flexible intelligence in our machines, we need to automate the knowledge‐acquisition process. A primary goal of learning research is to combine the self‐organizing methods—recursion, neural nets, evolutionary algorithms—in a sufficiently robust way that the systems can model and understand human language and

knowledge. Then the machines can venture out, read, and learn on their own. And like humans, such systems will be

good at faking it when they wander outside their areas of expertise.

EXPRESSING KNOWLEDGE THROUGH LANGUAGE

No knowledge is entirely reducible to words, and no knowledge is entirely ineffable.

—Seymour Papert

The fish trap exists because of the fish. Once youʹve gotten the fish you can forget the trap. The rabbit snare exists because of the rabbit. Once youʹve gotten the rabbit, you can forget the snare. Words exist because of meaning. Once youʹve gotten the meaning, you can forget the words. Where can I find a man who has forgotten words so I can talk with him?

—Chuang‐tzu

Language is the principal means by which we share our knowledge. And like other human technologies, language is

often cited as a salient differentiating characteristic of our species. Although we have limited access to the actual implementation of knowledge in our brains (this will change early in the twenty‐first century), we do have ready access to the structures and methods of language. This provides us with a handy laboratory for studying our ability to master knowledge and the thinking process behind it. Work in the laboratory of language shows, not surprisingly, that it is no less complex or subtle a phenomenon than the knowledge it seeks to transmit.

We find that language in both its auditory and written forms is hierarchical with multiple levels. There are ambiguities at each level, so a system that understands language, whether human or machine, needs built‐in knowledge at each level. To respond intelligently to human speech, for example, we need to know (although not necessarily consciously) the structure of speech sounds, the way speech is produced by the vocal apparatus, the patterns of sounds that comprise languages and dialects, the rules of word usage, and the subject matter being discussed. Each level of analysis provides useful constraints that limit the search for the right answer: For example, the basic sounds of speech called phonemes cannot appear in any order (try saying ptkee). Only certain sequences of

sounds will correspond to words in the language. Although the set of phonemes used is similar (though not identical) from one language to another, factors of context differ dramatically. English, for example, has more than 10,000 possible syllables, whereas Japanese has only 120.

On a higher level, the structure and semantics of a language put further constraints on allowable word sequences.

The first area of language to be actively studied was the rules governing the arrangement of words and the roles they play, which we call syntax. On the one hand, computerized sentence‐parsing systems can do a good job at analyzing

sentences that confuse humans. Minsky cites the example: ʺThis is the cheese that the rat that the cat that the dog chased bit ate,ʺ which confuses humans but which machines parse quite readily. Ken Church, then at MIT, cites another sentence with two million syntactically correct interpretations, which his computerized parser dutifully listed. [3] On the other hand, one of the first computer‐based sentence‐parsing systems, developed in 1963 by Susumu Kuno of Harvard, had difficulty with the simple sentence ʺTime flies like an arrow.ʺ In what has become a

famous response, the computer indicated that it was not quite sure what it meant. It might mean

1. that time passes as quickly as an arrow passes;

2. or maybe it is a command telling us to time the flies the same way that an arrow times flies; that is, ʺTime flies like an arrow wouldʺ;

3. or it could be a command telling us to time only those flies that are similar to arrows; that is, ʺTime flies

that are like an arrowʺ;

4. or perhaps it means that a type of flies known as time flies have a fondness for arrows: ʺTime‐flies like

(that is, cherish) an arrowʺ. [4]

Clearly we need some knowledge here to resolve this ambiguity. Armed with the knowledge that flies are not similar to arrows, we can knock out the third interpretation. Knowing that there is no such thing as a time‐fly dispatches the fourth explanation. Such tidbits of knowledge as the fact that flies do not show a fondness for arrows (another reason to knock out interpretation four) and that arrows do not have the ability to time events (knocking out interpretation two) leave us with the first interpretation as the only sensible one.
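
This knocking‐out process is mechanical enough to write down. In the Python sketch below, the four interpretations and the tidbits of world knowledge are exactly those of the preceding paragraphs; everything else is illustrative scaffolding.

    # Knowledge-based disambiguation of "Time flies like an arrow."
    knowledge = {
        "arrows_can_time_events": False,    # knocks out interpretation 2
        "flies_resemble_arrows": False,     # knocks out interpretation 3
        "time_flies_exist": False,          # knocks out interpretation 4
        "flies_cherish_arrows": False,      # also knocks out interpretation 4
    }

    interpretations = {
        1: lambda k: True,   # time passes quickly: conflicts with nothing
        2: lambda k: k["arrows_can_time_events"],
        3: lambda k: k["flies_resemble_arrows"],
        4: lambda k: k["time_flies_exist"] and k["flies_cherish_arrows"],
    }

    sensible = [n for n, plausible in interpretations.items()
                if plausible(knowledge)]
    print(sensible)    # [1] -- the only interpretation left standing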

In language, we again find the sequence of human learning and the progression of machine intelligence to be the

reverse of each other. A human child starts out listening to and understanding spoken language. Later on he learns to speak. Finally, years later, he starts to master written language. Computers have evolved in the opposite direction, starting out with the ability to generate written language, subsequently learning to understand it, then starting to speak with synthetic voices and only recently mastering the ability to understand continuous human speech. This phenomenon is widely misunderstood. R2D2, for example, the robot character of Star Wars fame, understands many human languages but is unable to speak, which gives the mistaken impression that generating human speech is far more difficult than understanding it.

‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

I FEEL GOOD WHEN I LEARN SOMETHING, BUT ACQUIRING KNOWLEDGE SURE IS A TEDIOUS PROCESS.

PARTICULARLY WHEN IʹVE BEEN UP ALL NIGHT STUDYING FOR AN EXAM. AND IʹM NOT SURE HOW

MUCH OF THIS STUFF I RETAIN.

Thatʹs another weakness of the human form of intelligence. Computers can share their knowledge with each other

readily and quickly. We humans donʹt have a means for sharing knowledge directly, other than the slow process of

human communication, of human teaching and learning.

DIDNʹT YOU SAY THAT COMPUTER NEURAL NETS LEARN THE SAME WAY PEOPLE DO?

You mean, slowly?

EXACTLY, BY BEING EXPOSED TO PATTERNS THOUSANDS OF TIMES, JUST LIKE US.

Yes, thatʹs the point of neural nets; theyʹre intended as analogues of human neural nets, at least simplified versions of what we understand them to be. However, we can build our electronic nets in such a way that once the net has

painstakingly learned its lessons, the pattern of its synaptic connection strengths can be captured and then quickly downloaded to another machine, or to millions of other machines. Machines can readily share all of their accumulated knowledge, so only one machine has to do the learning. We humans canʹt do that. Thatʹs one reason I said that when

computers reach the level of human intelligence, they will necessarily roar past it.

SO IS TECHNOLOGY GOING TO ENABLE US HUMANS TO DOWNLOAD KNOWLEDGE IN THE FUTURE? I

MEAN, I ENJOY LEARNING, DEPENDING ON THE PROFESSOR, OF COURSE, BUT IT CAN BE A DRAG.

The technology to communicate between the electronic world and the human neural world is already taking shape.

So we will be able to directly feed streams of data to our neural pathways. Unfortunately, that doesnʹt mean we can

directly download knowledge, at least not to the human neural circuits we now use. As weʹve talked about, human

learning is distributed throughout a region of our brain. Knowledge involves millions of connections, so our

knowledge structures are not localized. Nature didnʹt provide a direct pathway to adjust all those connections, other than the slow conventional way. While we will be able to create certain specific pathways to our neural connections, and indeed weʹre already doing that, I donʹt see how it would be practical to directly communicate to the many

millions of interneuronal connections necessary to quickly download knowledge.

I GUESS IʹLL JUST HAVE TO KEEP HITTING THE BOOKS. SOME OF MY PROFESSORS ARE KIND OF COOL,

THOUGH, THE WAY THEY SEEM TO KNOW EVERYTHING.

As I said, humans are good at faking it when we go outside of our area of expertise. However, there is a way that

downloading knowledge will be feasible by the middle of the twenty‐first century.

IʹM LISTENING.

Downloading knowledge will be one of the benefits of the neural‐implant technology. Weʹll have implants that

extend our capacity for retaining knowledge, for enhancing memory. Unlike nature, we wonʹt leave out a quick

knowledge‐downloading port in the electronic version of our synapses. So it will be feasible to quickly download

knowledge to these electronic extensions of our brains. Of course, when we fully port our minds to a new

computational medium, downloading knowledge will become even easier.

SO IʹLL BE ABLE TO BUY MEMORY IMPLANTS PRELOADED WITH A KNOWLEDGE OF, SAY, MY FRENCH LIT

COURSE.

Sure, or you can mentally click on a French literature web site and download the knowledge directly from the site.

KIND OF DEFEATS THE PURPOSE OF LITERATURE, DOESNʹT IT? I MEAN, SOME OF THIS STUFF IS NEAT TO

READ.

I would prefer to think that intensifying knowledge will enhance the appreciation of literature, or any art form. After all, we need knowledge to appreciate an artistic expression. Otherwise, we donʹt understand the vocabulary and the

allusions.

Anyway, youʹll still be able to read, just a lot faster. In the second half of the twenty‐first century, youʹll be able to read a book in a few seconds.

I DONʹT THINK I COULD TURN THE PAGES THAT FAST.

Oh come on, the pages will be—

VIRTUAL PAGES, OF COURSE.

‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

PART TWO

PREPARING

THE PRESENT

C H A P T E R S I X

BUILDING NEW BRAINS . . .

THE HARDWARE OF INTELLIGENCE

You can only make a certain amount with your hands, but with your mind, itʹs unlimited.

—Kal Seinfeldʹs advice to his son, Jerry

Letʹs review what we need to build an intelligent machine. One resource required is the right set of formulas. We examined three quintessential formulas in chapter 4. There are dozens of others in use, and a more complete understanding of the brain will undoubtedly introduce hundreds more. But all of these appear to be variations on the three basic themes: recursive search, self‐organizing networks of elements, and evolutionary improvement through repeated struggle among competing designs.

A second resource needed is knowledge. Some pieces of knowledge are needed as seeds for a process to converge

on a meaningful result. Much of the rest can be automatically learned by adaptive methods when neural nets or evolutionary algorithms are exposed to the right learning environment.

The third resource required is computation itself. In this regard, the human brain is eminently capable in some ways, and remarkably weak in others. Its strength is reflected in its massive parallelism, an approach that our computers can also benefit from. The brainʹs weakness is the extraordinarily slow speed of its computing medium, a

limitation that computers do not share with us. For this reason, DNA‐based evolution will eventually have to be abandoned. DNA‐based evolution is good at tinkering with and extending its designs, but it is unable to scrap an entire design and start over. Organisms created through DNA‐based evolution are stuck with an extremely plodding

type of circuitry.

But the Law of Accelerating Returns tells us that evolution will not remain stuck at a dead end for very long. And

indeed, evolution has found a way around the computational limitations of neural circuitry. Cleverly, it has created organisms that in turn invented a computational technology a million times faster than carbon‐based neurons (a technology, moreover, that is continuing to get faster still). Ultimately, the computing conducted on extremely slow mammalian neural circuits

will be ported to a far more versatile and speedier electronic (and photonic) equivalent.

When will this happen? Letʹs take another look at the Law of Accelerating Returns as applied to computation.

Achieving the Hardware Capacity of the Human Brain

In the chapter 1 chart, ʺThe Exponential Growth of Computing 1900–1998,ʺ we saw that the slope of the curve

representing exponential growth was itself gradually increasing. Computer speed (as measured in calculations per

second per thousand dollars) doubled every three years between 1910 and 1950, doubled every two years between

1950 and 1966, and is now doubling every year. This suggests possible exponential growth in the rate of exponential

growth. [1]
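
The arithmetic behind that claim is easy to check. Compounding the doubling schedule just described, in a few lines of Python:

    # Doublings implied by the schedule in the text: every three years from
    # 1910 to 1950, every two years to 1966, every year to 1998.
    doublings = (1950 - 1910) / 3 + (1966 - 1950) / 2 + (1998 - 1966) / 1
    print(doublings)        # about 53 doublings
    print(2 ** doublings)   # roughly a 10**16-fold gain in calculations
                            # per second per thousand dollars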

This apparent acceleration in the acceleration may result, however, from the confounding of the two strands of the

Law of Accelerating Returns, which for the past forty years has expressed itself using the Mooreʹs Law paradigm of

shrinking transistor sizes on an integrated circuit. As transistor sizes decrease, the electrons streaming through the transistor have less distance to travel, hence the switching speed of the transistor increases. So exponentially

improving speed is the first strand. Smaller transistors also enable chip manufacturers to squeeze a greater number of them onto an integrated circuit, so exponentially improving density of computation is the second

strand.

In the early years of the computer age, it was primarily the first strand—increasing circuit speeds—that improved

the overall computation rate of computers. During the 1990s, however, advanced microprocessors began using a form

of parallel processing called pipelining, in which multiple calculations were performed at the same time (some

mainframes going back to the 1970s used this technique). Thus the speed of computer processors as measured in

instructions per second now also reflects the second strand: greater densities of computation resulting from the use of parallel processing.

As we are approaching more perfect harnessing of the improving density of computation, processor speeds are

now effectively doubling every twelve months. This is fully feasible today when we build hardware‐based neural nets

because neural net processors are relatively simple and highly parallel. Here we create a processor for each neuron

and eventually one for each interneuronal connection. Mooreʹs Law thereby enables us to double both the number of

processors and their speed every two years, effectively quadrupling the number of interneuronal‐connection calculations per second.

This apparent acceleration in the acceleration of computer speeds may result, therefore, from an improving ability

to benefit from both strands of the Law of Accelerating Returns. When Mooreʹs Law dies by the year 2020, new forms

of circuitry beyond integrated circuits will continue both strands of exponential improvement. But ordinary

exponential growth—two strands of it—is dramatic enough. Using the more conservative prediction of just one level

of acceleration as our guide, letʹs consider where the Law of Accelerating Returns will take us in the twenty‐first

century.

The human brain has about 100 billion neurons. With an estimated average of one thousand connections between

each neuron and its neighbors, we have about 100 trillion connections, each capable of a simultaneous calculation.

Thatʹs rather massive parallel processing, and one key to the strength of human thinking. A profound weakness,

however, is the excruciatingly slow speed of neural circuitry, only 200 calculations per second. For problems that

benefit from massive parallelism, such as neural‐net‐based pattern recognition, the human brain does a great job. For problems that require extensive sequential thinking, the human brain is only mediocre.

With 100 trillion connections, each computing at 200 calculations per second, we get 20 million billion calculations per second. This is a conservatively high estimate; other estimates are lower by one to three orders of magnitude. So when will we see the computing speed of the human brain in your personal computer?

The answer depends on the type of computer we are trying to build. The most relevant is a massively parallel

neural net computer. In 1997, $2,000 of neural computer chips using only modest parallel processing could perform

around 2 billion connection calculations per second. Since neural net emulations benefit from both strands of the

acceleration of computational power, this capacity will double every twelve months. Thus by the year 2020, it will

have doubled about twenty‐three times, resulting in a speed of about 20 million billion neural connection calculations per second, which is equal to the human brain.
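
For readers who want to check the arithmetic, here it is in a few lines of Python, using the estimates given above as inputs:

    import math

    connections = 100e12     # 100 trillion interneuronal connections
    rate = 200               # calculations per second per connection
    brain_cps = connections * rate
    print(brain_cps)         # 2e+16: "20 million billion" per second

    # $2,000 of 1997 neural-net chips: about 2 billion connection
    # calculations per second, doubling every twelve months.
    doublings = math.log2(brain_cps / 2e9)
    print(doublings)                  # about 23.3 doublings
    print(1997 + round(doublings))    # the year 2020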

If we apply the same analysis to an ʺordinaryʺ personal computer, we get the year 2025 to achieve human brain

capacity in a $1,000 device. [2] This is because the general‐purpose type of computations that a conventional personal computer is designed for are inherently more expensive than the simpler, highly repetitive neural‐connection

calculations. Thus I believe that the 2020 estimate is more accurate because by 2020, most of the computations

performed in our computers will be of the neural‐connection type.

The memory capacity of the human brain is about 100 trillion synapse strengths (neurotransmitter concentrations

at interneuronal connections), which we can estimate at about a million billion bits. In 1998, a billion bits of RAM (128

megabytes) cost about $200. The capacity of memory circuits has been doubling every eighteen months. Thus by the

year 2023, a million billion bits will cost about $1,000. [3] However, this silicon equivalent will run more than a billion times faster than the human brain. There are techniques for trading off memory for speed, so we can effectively match human memory for $1,000 sooner than 2023.
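
The memory arithmetic can be checked the same way; straightforward compounding lands on 2024, within a year of the estimate above:

    import math

    bits_needed = 1e15                   # a million billion bits
    bits_per_dollar_1998 = 1e9 / 200     # a billion bits for $200
    target_bits_per_dollar = bits_needed / 1000    # for a $1,000 device
    halvings = math.log2(target_bits_per_dollar / bits_per_dollar_1998)
    print(1998 + round(halvings * 1.5))  # doubling every 18 months: ~2024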

Taking all of this into consideration, it is reasonable to estimate that a $1,000 personal computer will match the computing speed and capacity of the human brain by around the year 2020, particularly for the neuron‐connection calculation, which appears to comprise the bulk of the computation in the human brain. Supercomputers are one thousand to ten thousand times faster than personal computers. As this book is being written, IBM is building a supercomputer based on the design of Deep Blue, its silicon chess champion, capable of 10 teraflops (that is, 10 trillion calculations per second), only 2,000 times slower than the human brain. Japanʹs Nippon Electric Company hopes to

beat that with a 32‐teraflop machine. IBM then hopes to follow that with 100 teraflops by around the year 2004 (just

what Mooreʹs Law predicts, by the way). Supercomputers will reach the 20 million billion calculations per second capacity of the human brain around 2010, a decade earlier than personal computers. [4]

In another approach, projects such as Sun Microsystemsʹ Jini program have been initiated to harvest the unused

computation on the Internet. Note that at any particular moment, the significant majority of the computers on the Internet are not being used. Even those that are being used are not being used to capacity (for example, typing text uses less than one percent of a typical notebook computerʹs computing capacity). Under the Internet computation harvesting proposals, cooperating sites would load special software that would enable a virtual massively parallel computer to be created out of the computers on the network. Each user would still have priority over his or her own

machine, but in the background, a significant fraction of the millions of computers on the Internet would be harvested into one or more supercomputers. The amount of unused computation on the Internet today exceeds the

computational capacity of the human brain, so we already have available in at least one form the hardware side of human intelligence. And with the continuation of the Law of Accelerating Returns, this availability will become increasingly ubiquitous.

After human capacity in a $1,000 personal computer is achieved around the year 2020, our thinking machines will

improve the cost performance of their computing by a factor of two every twelve months. That means that the capacity of computing will double ten times every decade, which is a factor of one thousand (2 to the 10th power) every ten years. So your personal computer will be able to simulate the brain power of a small village by the year 2030, the entire population of the United States by 2048, and a trillion human brains by 2060. [5] If we estimate the human Earth population at 10 billion persons, one pennyʹs worth of computing circa 2099 will have a billion times greater computing capacity than all humans on Earth. [6]
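
Again, the projection is simple compounding: one human‐brain equivalent per $1,000 in 2020, doubling every twelve months thereafter.

    for year in (2030, 2048, 2060):
        print(year, 2 ** (year - 2020), "human-brain equivalents per $1,000")
    # 2030:             1,024   -- a small village
    # 2048:       268,435,456   -- roughly the U.S. population
    # 2060: 1,099,511,627,776   -- about a trillion human brains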

Of course I may be off by a year or two. But computers in the twenty‐first century will not be wanting for computing capacity or memory.

Computing Substrates in the Twenty‐First Century

Iʹve noted that the continued exponential growth of computing is implied by the Law of Accelerating Returns, which

states that any process that moves toward greater order—evolution in particular—will exponentially speed up its pace as time passes. The two resources that the exploding pace of an evolutionary process—such as the progression of computer technology—requires are (1) its own increasing order, and (2) the chaos in the environment in which it takes place. Both of these resources are essentially without limit.

Although we can anticipate the overall acceleration in technological progress, one might still expect that the actual manifestation of this progression would be somewhat irregular. After all, it depends on such variable phenomena as individual innovation, business conditions, investment patterns, and the like. Contemporary theories of evolutionary processes, such as the punctuated equilibrium theories, [7] posit that evolution works by periodic leaps or discontinuities followed by periods of relative stability. It is thus remarkable how predictable computer progress has been.

So, how will the Law of Accelerating Returns as applied to computation roll out in the decades beyond the demise

of Mooreʹs Law on Integrated Circuits by the year 2020? For the immediate future, Mooreʹs Law will continue with ever smaller component geometries packing greater numbers of yet faster transistors on each chip. But as circuit dimensions reach near atomic sizes, undesirable quantum effects such as unwanted electron tunneling will produce

unreliable results. Nonetheless, Mooreʹs standard methodology will get very close to human processing power in a personal computer and beyond that in a supercomputer.

The next frontier is the third dimension. Already, venture‐backed companies (mostly California‐based) are competing to build chips with dozens and ultimately thousands of layers of circuitry. With names like Cubic Memory, Dense‐Pac, and Staktek, these companies are already shipping functional three‐dimensional ʺcubesʺ of circuitry. Although not yet cost competitive with the customary flat chips, the third dimension will be there when we run out of space in the first two. [8]

Computing with Light

Beyond that, there is no shortage of exotic computing technologies being developed in research labs, many of which

have already demonstrated promising results. Optical computing uses streams of photons (particles of light) rather than electrons. A laser can produce billions of coherent streams of photons, with each stream performing its own independent series of calculations. The calculations on each stream are performed in parallel by special optical elements such as lenses, mirrors, and diffraction gratings. Several companies, including Quanta‐Image, Photonics, and Mytec Technologies, have applied optical computing to the recognition of fingerprints; Lockheed has applied optical computing to the automatic identification of malignant breast lesions. [9]

The advantage of an optical computer is that it is massively parallel with potentially trillions of simultaneous calculations. Its disadvantage is that it is not programmable and performs a fixed set of calculations for a given configuration of optical computing elements. But for important classes of problems such as recognizing patterns, it combines massive parallelism (a quality shared by the human brain) with extremely high speed (which the human brain lacks).

Computing with the Machinery of Life

A new field called molecular computing has sprung up to harness the DNA molecule itself as a practical computing

device. DNA is natureʹs own nanoengineered computer and it is well suited for solving combinatorial problems.

Combining attributes is, after all, the essence of genetics. Applying actual DNA to practical computing applications got its start when Leonard Adleman, a University of Southern California mathematician, coaxed a test tube full of DNA molecules (see the box below) to solve the well‐known ʺtraveling salespersonʺ problem. In this classic problem, we try to find an optimal route for a hypothetical traveler between multiple cities without visiting any city more than once. Only certain city pairs are connected by routes, so finding the right path is not straightforward. It is an ideal problem for a recursive algorithm, although if the number of cities is too large, even a very fast recursive search will take far too long.

Professor Adleman and other scientists in the molecular‐computing field have identified a set of enzyme reactions

that corresponds to the logical and arithmetic operations needed to solve a variety of computing problems. Although

DNA molecular operations produce occasional errors, the number of DNA strands being used is so large that any molecular errors become statistically insignificant. Thus, despite the inherent error rate in DNAʹs computing and copying processes, a DNA computer can be highly reliable if properly designed.

DNA computers have subsequently been applied to a range of difficult combinatorial problems. A DNA computer

is more flexible than an optical computer but it is still limited to the technique of applying massive parallel search by assembling combinations of elements. [10]

There is another, more powerful way to apply the computing power of DNA that has not yet been explored. I present it below in the section on quantum computing.

HOW TO SOLVE THE TRAVELING-SALESPERSON

PROBLEM USING A TEST TUBE OF DNA

One of DNA's advantageous properties is its ability to replicate itself, and the information it contains. To solve

the traveling-salesperson problem, Professor Adleman performed the following steps:

• Generate a small strand of DNA with a unique code for each city.

• Replicate each such strand (one for each city) trillions of times using a process called "polymerase

chain reaction" (PCR).

• Next, put the pools of DNA (one for each city) together in a test tube. This step uses DNA's affinity to

link strands together. Longer strands will form automatically. Each such longer strand represents a

possible route of multiple cities. The small strands representing each city link up with one another in a

random fashion, so there is no mathematical certainty that a linked strand representing the correct

answer (sequence of cities) will be formed. However, the number of strands is so vast that it is

virtually certain that at least one strand—and probably millions—will be formed that represent the

correct answer.

The next steps use specially designed enzymes to eliminate the trillions of strands that represent the

wrong answer, leaving only the strands representing the correct answer:

• Use molecules called primers to destroy those DNA strands that do not start with the start city as well

as those that do not end with the end city, and replicate these surviving strands (using PCR).

• Use an enzyme reaction to eliminate those DNA strands that represent a travel path longer than the

total number of cities.

• Use an enzyme reaction to destroy those strands that do not include the first city. Repeat for each of

the cities.

• Now, each of the surviving strands represents the correct answer. Replicate these surviving strands

(using PCR) until there are billions of such strands.

• Using a technique called electrophoresis, read out the DNA sequence of these correct strands (as a

group). The readout looks like a set of distinct lines, which specifies the correct sequence of cities.
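
The boxʹs generate‐and‐filter logic can be caricatured in ordinary software. The Python sketch below is only an analogy: two million random strings stand in for trillions of strands, each list filter stands in for an enzyme step, and the seven‐city map is invented for illustration.

    import random

    cities = "ABCDEFG"   # start city A, end city G (a hypothetical map)
    routes = {("A", "B"), ("B", "C"), ("C", "D"), ("A", "D"), ("D", "E"),
              ("E", "F"), ("C", "F"), ("F", "G"), ("B", "E")}

    def linked(path):    # consecutive cities must be joined by a route
        return all((a, b) in routes or (b, a) in routes
                   for a, b in zip(path, path[1:]))

    # Random linking: form "strands" by chaining city codes at random.
    strands = ["".join(random.choices(cities, k=len(cities)))
               for _ in range(2_000_000)]

    # "Enzyme" steps: keep strands that start at A and end at G...
    strands = [s for s in strands if s[0] == "A" and s[-1] == "G"]
    # ...that visit every city exactly once...
    strands = [s for s in strands if len(set(s)) == len(cities)]
    # ...and whose consecutive cities are actually connected.
    strands = [s for s in strands if linked(s)]

    print(set(strands))  # the surviving "strands" spell out the answers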