The quest for the origins of the Indo-Europeans has all the fascination of an electric light in the open air on a summer night: it attracts every species of scholar or would-be savant who can take pen to hand. For over 200 years, theories have been put forward advocating ages ranging from 4000 to 23,000 years, with hypothesized homelands including Central Europe, the Balkans, and even India. Unfortunately, archaeological, genetic and linguistic research on Indo-European origins has so far proved inconclusive.
In early historical times, Indo-European languages were present in central and northern Europe, southeastern Europe, and much of southern and southwest Asia. Scholars have long theorized about what pre-historic events might have caused this group to become so widespread.
In the late 1700s, William Jones, an English judge serving in India, discovered something startling: Sanskrit possesses striking similarities to Greek, Latin, and Celtic. He theorized that they all sprang from a common source, an even more ancient language that had since become extinct.
Since Jones’ time, linguists have verified and greatly expanded on his discovery. Languages that fall into this Indo-European classification include Romance languages, Germanic languages, including of course English, Balto-Slavic, Indo-Iranian and Celtic, plus extinct languages such as Tocharian, spoken in parts of China, and Hittite, spoken in Asia Minor. It’s a huge and incredibly diverse group that derives from the speech of one ancient and forgotten people. The questions remain of who they were, when they lived, and where they came from.
Family Tree of Languages Has Roots in Anatolia, Biologists Say
Biologists using tools developed for drawing evolutionary family trees say that they have solved a longstanding problem in archaeology: the origin of the Indo-European family of languages.
The family includes English and most other European languages, as well as Persian, Hindi and many others. Despite the importance of the languages, specialists have long disagreed about their origin.
Linguists believe that the first speakers of the mother tongue, known as proto-Indo-European, were chariot-driving pastoralists who burst out of their homeland on the steppes above the Black Sea about 4,000 years ago and conquered Europe and Asia. A rival theory holds that, to the contrary, the first Indo-European speakers were peaceable farmers in Anatolia, now Turkey, about 9,000 years ago, who disseminated their language by the hoe, not the sword.
The new entrant to the debate is an evolutionary biologist, Quentin Atkinson of the University of Auckland in New Zealand. He and colleagues have taken the existing vocabulary and geographical range of 103 Indo-European languages and computationally walked them back in time and place to their statistically most likely origin.
The result, they announced in Thursday’s issue of the journal Science, is that “we found decisive support for an Anatolian origin over a steppe origin.” Both the timing and the root of the tree of Indo-European languages “fit with an agricultural expansion from Anatolia beginning 8,000 to 9,500 years ago,” they report.
But despite its advanced statistical methods, their study may not convince everyone.
The researchers started with a menu of vocabulary items that are known to be resistant to linguistic change, like pronouns, parts of the body and family relations, and compared them with the inferred ancestral word in proto-Indo-European. Words that have a clear line of descent from the same ancestral word are known as cognates. Thus “mother,” “mutter” (German), “mat’ ” (Russian), “madar” (Persian), “matka” (Polish) and “mater” (Latin) are all cognates derived from the proto-Indo-European word “mehter.”
Dr. Atkinson and his colleagues then scored each set of words on the vocabulary menu for the 103 languages. In languages where the word was a cognate, the researchers assigned it a score of 1; in those where the cognate had been replaced with an unrelated word, it was scored 0. Each language could thus be represented by a string of 1’s and 0’s, and the researchers could compute the most likely family tree showing the relationships among the 103 languages.
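As a rough illustration of the scoring step described above, the following Python sketch turns cognate judgments into strings of 1's and 0's and counts how many positions differ between each pair of languages. The language names, the meaning list and the judgments are invented placeholders, not data from the study. The pairwise difference count is only a simple stand-in: the researchers inferred their tree with statistical methods from evolutionary biology, which this sketch does not reproduce.

```python
from itertools import combinations

# Hypothetical vocabulary menu and cognate judgments (not the study's data).
MEANINGS = ["mother", "I", "we", "hand", "two"]
COGNATE_MEANINGS = {
    "LangA": {"mother", "I", "we", "hand", "two"},
    "LangB": {"mother", "I", "we", "two"},
    "LangC": {"mother", "two"},
}


def encode(retained, meanings=MEANINGS):
    """Return a string with 1 where the cognate survives and 0 where it was replaced."""
    return "".join("1" if meaning in retained else "0" for meaning in meanings)


def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


# One binary string per language.
codes = {lang: encode(retained) for lang, retained in COGNATE_MEANINGS.items()}

# Number of differing positions for every pair of languages.
distances = {
    (a, b): hamming(codes[a], codes[b])
    for a, b in combinations(sorted(codes), 2)
}

for pair, distance in distances.items():
    print(pair, distance)
```

In this toy data, LangA and LangB differ in only one position, so they would sit closer together on a tree than either does to LangC.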
A computer was then supplied with known dates of language splits. Romanian and other Romance languages, for instance, started to diverge from Latin after A.D. 270, when Roman troops pulled back from the Roman province of Dacia. Applying those dates to a few branches in its tree, the computer was able to estimate dates for all the rest.
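The idea of using a few known dates to estimate the rest can be sketched with a deliberately simple "strict clock", in which vocabulary changes at one constant rate. This is an assumption made for illustration only and is not claimed to be the model used in the study. The function names and the shares of differing vocabulary below are invented. The only figure taken from the text is the A.D. 270 calibration point, which lies roughly 1,750 years in the past.

```python
def calibrate_rate(share_differing, split_age_years):
    """Estimate change per lineage per year from one split of known age.

    Both descendant lineages have been changing since the split,
    so the observed difference is divided by twice the age.
    """
    if split_age_years <= 0:
        raise ValueError("split age must be positive")
    return share_differing / (2 * split_age_years)


def estimate_split_age(share_differing, rate_per_year):
    """Estimate the age of an undated split from the calibrated rate."""
    if rate_per_year <= 0:
        raise ValueError("rate must be positive")
    return share_differing / (2 * rate_per_year)


# Hypothetical inputs: a dated pair differing in 10% of the vocabulary menu,
# and an undated pair differing in 40% of it.
CALIBRATION_AGE_YEARS = 1750  # roughly the time elapsed since A.D. 270
rate = calibrate_rate(0.10, CALIBRATION_AGE_YEARS)
undated_age = estimate_split_age(0.40, rate)

print(f"estimated age of undated split: {undated_age:.0f} years")
```

A linear clock like this ignores words that are replaced more than once and any variation in the rate of change between branches, which is one reason real analyses rely on more elaborate statistical models.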
The computer was also given geographical information about the present range of each language and told to work out the likeliest pathways of distribution from an origin, given the probable family tree of descent. The calculation pointed to Anatolia, particularly a lozenge-shaped area in what is now southern Turkey, as the most plausible origin — a region that had also been proposed as the origin of Indo-European by the archaeologist Colin Renfrew, in 1987, because it was the source from which agriculture spread to Europe.
Dr. Atkinson’s work has integrated a large amount of information with a computational method that has proved successful in evolutionary studies. But his results may not sway supporters of the rival theory, who believe the Indo-European languages were spread some 5,000 years later by warlike pastoralists who conquered Europe and India from the Black Sea steppe.
A key piece of their evidence is that proto-Indo-European had a vocabulary for chariots and wagons that included words for “wheel,” “axle,” “harness-pole” and “to go or convey in a vehicle.” These words have numerous descendants in the Indo-European daughter languages. So Indo-European itself cannot have fragmented into those daughter languages, historical linguists argue, before the invention of chariots and wagons, the earliest known examples of which date to 3500 B.C. This would rule out any connection between Indo-European and the spread of agriculture from Anatolia, which occurred much earlier.
“I see the wheeled-vehicle evidence as a trump card over any evolutionary tree,” said David Anthony, an archaeologist at Hartwick College who studies Indo-European origins.
Historical linguists see other evidence in that the first Indo-European speakers had words for “horse” and “bee,” and lent many basic words to proto-Uralic, the mother tongue of Finnish and Hungarian. The best place to have found wild horses and bees and be close to speakers of proto-Uralic is the steppe region above the Black Sea and the Caspian. The Kurgan people who occupied this area from around 5000 to 3000 B.C. have long been candidates for the first Indo-European speakers.
In a recent book, “The Horse, the Wheel and Language,” Dr. Anthony describes how the steppe people developed a mobile society and social system that enabled them to push out of their homeland in several directions and spread their language east, west and south.
Dr. Anthony said he found Dr. Atkinson’s language tree of Indo-European implausible in several details. Tocharian, for instance, is a group of Indo-European languages spoken in northwest China. It is hard to see how Tocharians could have migrated there from southern Turkey, he said, whereas there is a well-known migration from the Kurgan region to the Altai Mountains of eastern Central Asia, which could be the precursor of the Tocharian-speakers who lived along the Silk Road.
Dr. Atkinson said that this was a “hand-wavy argument” and that such conjectures should be judged in a quantitative way.
Dr. Anthony, noting that neither he nor Dr. Atkinson is a linguist, said that cognates were only one ingredient for reconstructing language trees, and that grammar and sound changes should also be used. Dr. Atkinson’s reconstruction is “a one-legged stool, so it’s not surprising that the tree it produces contains language groupings that would not survive if you included morphology and sound changes,” Dr. Anthony said.
Dr. Atkinson responded that he did indeed run his computer simulation on a grammar-based tree constructed by Don Ringe, an expert on Indo-European at the University of Pennsylvania, but that the resulting origin was, again, Anatolia, not the Pontic steppe.
Origin of Indo-European Languages Traced to Turkey Using New Mapping Tool
Over the course of the previous two decades, researchers have posited that primitive forms of the Indo-European language were spoken across Europe some thousands of years earlier than had previously been assumed. The English language is a member of the large Indo-European language family, which is spoken across a wide swath of the world: people in North America, South America, Africa, Australia, and much of Southeast Asia speak languages that belong to the family. While it is indisputable that the family’s languages are spoken by a large share of the world’s people, their origins are less certain. One predominant theory had placed the origins of the languages in the Pontic-Caspian steppes. Now a team of researchers has concluded that the origins of the English language, as well as the other languages in the Indo-European family, are in a region of Turkey.
Quentin D. Atkinson from the Australian National University and England’s University of Oxford, along with his multinational team of researchers from Australia, Belgium, the Netherlands, New Zealand and the United States, used a rather interesting method to discover what they believe is the origin of the language family. They borrowed a technique used in mapping the geographic origin of viral outbreaks such as HIV and H1N1. Their map led them to place the origin of the language family in Anatolia, a peninsula in what is now Turkey, between 8,000 and 9,500 years ago. The language is believed to have spread with the expansion of farming. Today, the University of Auckland published this research in the journal Science.
“If you know how viruses are related to one another you can trace back through their ancestry and find out where they originated,” explains lead researcher Dr Quentin Atkinson from the Department of Psychology. “We’ve used those methods and applied them to languages.” For this study, the researchers compared cognates: words in two languages that look similar and mean the same thing. For example, the English “chair” is a cognate of the French “chaise”.
Dr Atkinson worked with researchers from Europe and North America as well as with computer scientists Dr Remco Bouckaert and Associate Professor Alexei Drummond and fellow psychologist Professor Russell Gray, all from The University of Auckland.
The study examined basic vocabulary terms and geographic information from 103 ancient and contemporary Indo-European languages. Extinct languages, like Hittite, were included because they were in existence some 3,000 years ago, providing linguists a way to reach back in time. The location and age of the languages’ common ancestor supported the Anatolian hypothesis.
They then incorporated important historical events, like the fall of the Roman Empire, to discern a time period for the languages’ evolution.
The findings are consistent with the expansion of agriculture into Europe via the Balkans, reaching the edge of western Europe by 5,000 years ago. They are also consistent with genetic and skull-measurement data, which indicate an Anatolian contribution to the European gene pool.
The work follows a 2003 Nature paper from the same research group, which first used methods from evolutionary biology to build the languages’ family tree. The age of the tree was consistent with Anatolian origins as opposed to the more conventional view that the languages emerged thousands of years later near the Caspian Sea.
“The two competing theories imply two different ages and locations for the origin of the language family. We initially used the age of the family to test the theories,” says Dr Atkinson of the original work.
While the findings made a strong case for the Anatolian hypothesis, some members of the research community remained unconvinced.
The current research, which includes both geographic and historical data, confirms the languages’ Anatolian origins. “It reinforces our earlier findings, and applies exciting new methods from epidemiology to study languages,” says Dr Atkinson. “We’ve developed an entirely new methodology for inferring human prehistory from language data. It allows us to place these language family trees on a map in space and time and play out histories over the landscape.”
The Indo-European languages, a family of several hundred languages and dialects, are spoken by almost three billion native speakers and include languages such as English, Spanish, French, German, Hindi and Bengali.
The conventional “steppe hypothesis” posits that the languages originated in the Pontic steppe region north of the Caspian Sea, and were spread into Europe and the Near East by Kurgan semi-nomadic pastoralists beginning 5,000 to 6,000 years ago.
The “Anatolian hypothesis” argues that the languages spread with the expansion of agriculture from Anatolia beginning 8,000 to 9,000 years ago.
Indo-European languages came from a common root about 15,000 years ago
“Latin is a latecomer,” says Professor Mark Pagel, an evolutionary biologist at the University of Reading. “Today we think of Latin as an ancient language, a dead language, the language of antiquity, but that is a relative newcomer on the European language scene; it was used just 2,000-4,000 years ago.”
Using statistical models, the professor has traced the Indo-European languages back to a proto-language that was probably spoken about 15,000 years ago and is the common root of about seven language families today. Alongside Indo-European, which covers most European languages, these are Altaic, which includes modern-day Turkish, Uzbek and Mongolian; Chukchi-Kamchatkan, spoken in northeastern Siberia; Dravidian, spoken in southern India; Inuit-Yupik, spoken in the Arctic; Kartvelian, which evolved into Georgian; and Uralic, which is the mother of Finnish and Hungarian.
Language that stayed the course
Up until this study, most linguists agreed that they could trace back the origins of our language about 8,000-9,000 years. It was thought that most words used before then would already have disappeared, eroded from the languages we know today. But according to a new study in PNAS (Proceedings of the National Academy of Sciences), some words we still use today can trace their origins back to a common Indo-European language that was spoken right across Southern Europe, modern-day Turkey and Iraq.
23 common words…
They took a list of about 23 commonly used words in our language, words like “I”, “we”, “mother”, and “man”, but also more surprising ones like “bark”, from a tree, and the verb “to spit”. This all started with work Pagel and his team did about five years ago, which found that they were able to trace the ancestry of words by analyzing how often they appeared in speech today, and what type of word (be it adverb, verb or noun) it was.
The words that appeared most commonly were found to have evolved the slowest, or been eroded the least, and so the researchers reasoned that the most common words in our vocabulary today might also be the oldest and the closest to their original sound. Through that, they were able to show that words could outlive the previously assumed time frames and could have come from much longer ago.
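One way to picture this argument is a constant-rate replacement model, in which each word has a "half-life": the time after which there is a 50 percent chance it has been replaced by an unrelated word. The sketch below is an illustrative assumption, not necessarily the model used in the study, and the half-life values are invented.

```python
def retention_probability(years, half_life_years):
    """Chance that a word has not been replaced after `years`,
    assuming a constant replacement rate (exponential decay)."""
    if half_life_years <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (years / half_life_years)


# Hypothetical half-lives in years: frequently used words are assumed
# to be replaced more slowly than rarely used ones.
HALF_LIVES = {
    "rarely used word": 2_000,
    "very frequently used word": 10_000,
}

for label, half_life in HALF_LIVES.items():
    chance = retention_probability(15_000, half_life)
    print(f"{label}: {chance:.1%} chance of surviving 15,000 years")
```

Under these assumed values, the rarely used word has almost no chance of surviving 15,000 years, while the frequently used word still has a substantial one, which is the intuition behind looking for very old words among the most common ones.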
…With ‘a common starting point’
“So we took that statistical framework, to predict words which have evolved so slowly that they might have lasted long enough to have been retained among the language families of Eurasia,” Pagel told DW. “So, if you entertain the idea that they all had a common starting point – and those words went forward in time through the various language families and evolved so slowly that we can turn the clock backwards – then, we discover that, indeed, they are similar in these language families.”
Even today, across the European languages, the word for “I,” for instance, has a common root: I in English, Ich in German, Je in French, Io in Italian, Yo in Spanish. In the Proto-European language this was closer to what we would recognize as the word for “me,” or the sound “meh” or “beh,” says Pagel.
At the end of the last Ice Age
To put it in context, 15,000 years ago is when Europe was just emerging from the last Ice Age. The continent had been depopulated as the ice sheets spread southwards, and gradually, as they receded, hunter-gatherer peoples spread out from southern Europe, further north, west and east.
This proto-language was probably spoken in and around the area that is modern-day Turkey, and up into Iraq, the region known in antiquity as Mesopotamia. “Modern humans with language were in Europe maybe 40,000 years ago, but what we think is happening with this Eurasian language super family is a spread which is restricted more or less with this retreat of the ice sheet, so it wasn’t a movement down south, where there would have been fairly well entrenched people living in temperate or tropical climates. But this whole new area of Eurasia was opening up as the ice sheets retreated, so maybe that’s why it’s sort of a northern expansion.”
That early proto-European language probably had something of the language structure of German, Pagel thinks, at least in the subject, object, verb order of a sentence, as opposed to the subject, verb, object order found in modern-day English, French, Italian and Spanish.
‘Mama’ has stayed the course, but so has ‘to spit’
Most of the 23 common root words were not surprises. “Mother” or “mama” is similar to the first babblings a baby makes, and also similar to the proto-words, which were things like “umma,” “imma” and “emma,” so the word didn’t need to evolve that much.
But other words, like “bark” (of the tree), were more surprising in a modern context, since it is not a word that you use much in conversation today. Pagel thinks it survived because bark was so important to our forebears: they wove it into buildings, clothes, tools and weapons, which is perhaps why it has stuck with us throughout the centuries. The verb “to spit,” too, was a surprising one, thought Pagel, until he consulted linguists, who found the common root was an onomatopoeic sound that resembles the spitting action itself.