Discussion Problems on Language Relatedness and the Methodology of Classification



"In Search of the First Language"
Discussion Questions (APS page 23)

(1) The comparative method (using inter-language comparisons to establish systematic correspondences) and the method of mass comparison are generally treated as two differing ways of working out genetic relationships between languages.

Data for both comparative method and mass comparison consists of forms which combine sound/meaning resemblances (particularly lexical items, but also grammatical morphemes).
The COMPARATIVE METHOD relies on finding repeated and regular sound correspondences which can be attributed to direct descent from a common ancestor. Note that any cognate items are relevant, i.e. compared items often are not from basic vocabulary. Thus, English tame and Latin domus could be used to verify t/d correspondences between these languages (also seen in two/duo, tooth/dentis, etc.), even though the meanings are different and at least the English word would not appear in a list of "basic" vocabulary.
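The tallying of recurrent correspondences can be sketched in a few lines of Python. This is only an illustration, assuming that the first letter of each spelled form is an adequate stand-in for its initial sound; the function name initial_correspondences is hypothetical, and the word pairs are the three cited above.

```python
from collections import Counter

# Word pairs cited in the text, as (English, Latin).
PAIRS = [("tame", "domus"), ("two", "duo"), ("tooth", "dentis")]

def initial_correspondences(pairs):
    """Count how often each (initial in A, initial in B) pairing recurs.

    Assumption: spelling approximates sound well enough for this toy case.
    """
    return Counter((a[0], b[0]) for a, b in pairs)

counts = initial_correspondences(PAIRS)
for (eng, lat), n in counts.most_common():
    print(f"English {eng} : Latin {lat} recurs in {n} pairs")
```

A pairing that recurs across many independent items is the kind of regularity the comparative method looks for; a pairing seen only once carries little weight.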
MASS COMPARISON compares lists of basic vocabulary from as many languages as possible, seeking resemblances in vocabulary across languages which are too numerous to attribute to chance (cf. the notebooks which Greenberg describes in the film). Demonstrated regular sound correspondences, while desirable, are not required. One does not usually stray far from basic vocabulary, since once too much semantic freedom is allowed, e.g. comparing a word meaning 'see' (an item of basic vocabulary) in one language with a word meaning 'disease of the eyes' (a non-basic item) in some other language, one loses control of what constitutes a legitimate comparison.
The purpose of MASS COMPARISON is to establish genetic relationship between languages. The primary purpose of the COMPARATIVE METHOD is to reconstruct proto-forms which can be related to their reflexes by regular changes. To attempt such a task assumes genetic relationship, but some practitioners would claim that without the presence of documentable regularity in sound correspondences, such a relation is not proven.
The two pursuits should not be in conflict inasmuch as their goals are different: mass comparison seeks to discover linguistic relationships which are not immediately obvious, while the comparative method confirms those relationships and makes their nature more precise.
Most criticisms are aimed at MASS COMPARISON. Critics state that there are no controls on what passes for a "resemblance" in either the sounds compared or in the range of meanings allowed for compared items. No one would criticize the comparative method per se, but proponents of mass comparison as a technique for arriving at genetic groups would argue that if absolute regularity of sound correspondence and an understanding of the history of every form were a requirement to "prove" genetic relationship, only the most obvious groupings would ever be arrived at.

(2) Why do some linguists believe there is a temporal limit beyond which it is impossible to provide any convincing evidence of genetic relationship? About how far back is that temporal limit? How is this putative limitation reflected in present-day classification of the world's languages?

A number of researchers would argue that the limit for demonstrating genetic relationship between languages is about 10,000 years from the time of the original ancestor language. At that depth, given what we know of the rate of vocabulary change, little or nothing would be left of the original vocabulary in the descendant languages, and any apparent resemblances could not be convincingly shown to be cognates rather than chance resemblances. Evidence for this claim is the fact that none of the well-accepted families is reconstructed as having a history longer than 10,000 years, and most such families have shorter histories.
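The arithmetic behind this argument can be made concrete. The sketch below assumes the constant retention rate used in classical glottochronology, roughly 86% of a 100-item basic word list surviving per millennium; that figure does not come from this handout and is itself disputed, and the function name shared_retention is hypothetical. If two descendant languages lose vocabulary independently, the share of the list that both still retain t millennia after they split is the rate raised to the power 2t.

```python
def shared_retention(millennia, rate=0.86):
    """Expected fraction of a basic word list still retained in BOTH of two
    languages that split `millennia` thousand years ago.

    Assumption: each language independently keeps `rate` of the list per
    millennium (the classical glottochronological constant, not a fact
    taken from this handout).
    """
    return rate ** (2 * millennia)

for t in (2, 5, 10):
    print(f"{t * 1000:>6} years: {shared_retention(t):.1%} shared")
```

Under these assumptions only about 5% of the list survives in both languages at 10,000 years, a level that is hard to tell apart from chance resemblance.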

(3) What sorts of classifications have been proposed for American Indian languages? What controversies have arisen in classification of these languages?

Classifications range from several hundred families of languages in North and South America down to just three big families. The most vitriolic controversy has centered on Greenberg's 1987 Language in the Americas, which proposes just three big families. The arguments have been carried on in the popular print media, in academic journals, and at conferences. The geneticist Luigi Cavalli-Sforza claims to have established human genetic support for Greenberg's groups from DNA sampling. More conservative linguists consider Greenberg's classification to be just pre-scientific eyeballing of data, pointing out that we have no reason to believe that just three big migrations took place into the Americas, and even if there were just three such migrations, there is no reason to believe that all the members of each migration spoke related languages.


(4) What is Nostratic?

Nostratic is a "super-phylum" grouping together Indo-European, Uralic, Altaic, Dravidian, Caucasian (or Kartvelian = South Caucasian), and Afroasiatic.

This grouping was proposed by V. M. Illič-Svityč in the early 1960s. Two of its main living proponents, both seen in the film, are Vitaly Shevoroshkin and Aharon Dolgopolsky.
These researchers claim to have reconstructed several hundred items of vocabulary exhibiting regular sound correspondences. See box, page 22 of the APS book, for an illustration from Dolgopolsky.

(5) What range of views can be found among linguists on the reconstructability or even the existence of "proto-Human"?

Views on proto-Human range from proposals by people like Merritt Ruhlen for reconstruction of a few vocabulary items to complete dismissal of the idea that we can say anything at all about a language that may have been spoken 100,000 years ago. Nonetheless, most linguists probably accept the idea that all human languages have a single origin. Language is genetically part of being human. To argue that language had multiple origins would be like arguing that the human-shaped vocal tract had multiple origins.

(6) Give some ways that (at least some linguists have claimed that) linguistic classification and reconstruction have implications for human history.

The video mentions at least two:



Practice Classifying Languages through Mass Comparison

Group the languages in the table below into two or more genetic groups based on shared vocabulary resemblances

In each row, the words which seem to go together are color coded, e.g. in the row for 'one', languages B and D seem to go together (coded as red), languages E and H seem to go together (coded as green), and languages I and K seem to go together (coded as aqua). Words in black do not seem to group with other words in the row. Using this color coding, we can look down the columns and see that for each column, one color seems to predominate. The languages with a particular predominating color form our apparent genetic groups--see below.
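The "look down the columns" step can be sketched as a small tally. The coding below is HYPOTHETICAL toy data for four made-up languages P, Q, R, and S, not the handout's table; the function names are likewise illustrative. None stands for a word printed in black. The 'mouth' row deliberately includes one stray resemblance across groups, of the kind that a single row can show without changing the overall picture.

```python
from collections import Counter, defaultdict

# HYPOTHETICAL toy coding, not the handout's table: for each gloss, the
# color assigned to each language's word (None = black, i.e. no match).
CODING = {
    "one":   {"P": "red", "Q": "red", "R": "green", "S": None},
    "two":   {"P": "red", "Q": None,  "R": "green", "S": "green"},
    "eye":   {"P": "red", "Q": "red", "R": None,    "S": "green"},
    "mouth": {"P": None,  "Q": "red", "R": "green", "S": "red"},
}

def predominant_colors(coding):
    """Return {language: most frequent non-black color in its column}."""
    columns = defaultdict(Counter)
    for row in coding.values():
        for lang, color in row.items():
            if color is not None:
                columns[lang][color] += 1
    return {lang: c.most_common(1)[0][0] for lang, c in columns.items()}

def groups(coding):
    """Group languages that share the same predominant color."""
    out = defaultdict(list)
    for lang, color in predominant_colors(coding).items():
        out[color].append(lang)
    return {color: sorted(langs) for color, langs in out.items()}

print(groups(CODING))
```

The sketch does nothing to decide which words resemble each other in the first place; that judgment, which is where critics of mass comparison focus, is taken as given in the coding.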

Genetic groupings of the languages in the table above:

Saharan: A. Berti, E. Teda, H. Kanuri, J. Zaghawa
Niger-Congo: B. Kotopo (Adamawa), F. Ahlõ (Kwa), G. proto-Bantu, I. Efik (Benue-Congo)
Chadic: C. Bolanci, D. Miya, K. Ngizim

Are there any resemblances which might be a result of chance, borrowing, or sound symbolism rather than historical descent from a common origin?

Within the genetic groups, are there any subgroupings which emerge?

The only subgrouping that emerges with any clarity is E (Teda) and H (Kanuri) within Saharan. These languages share more resemblances with each other than either does with the other Saharan languages: the words for 'eye', 'ear', and 'mouth' are very close and, most important, they share a word for 'mouth' which is different from that word in the other Saharan languages. For subgrouping, we look for shared innovations rather than shared retentions from the proto-language.

Had we included Hausa as one of the Chadic languages, we could see such a shared innovation with Bolanci in the word for 'two', which is biyu in Hausa. As we noted above, Bolanci bolau is known to be a borrowing from Niger-Congo. The native Chadic root is the *s-r- root seen in Miya and Ngizim. Bolanci and Hausa share the innovative (borrowed) word for 'two' and thus can be recognized as belonging to a separate Chadic subgroup from Miya and Ngizim. On the other hand, we CANNOT group Miya and Ngizim as a subgroup on the basis of the word for 'two'. They share this root because they inherited it from proto-Chadic, not because they share an innovation.
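The logic of this paragraph, that only shared innovations count toward subgrouping, can be sketched as follows. The data structure and the function name shared_innovations are hypothetical; the classification of each language's word for 'two' simply restates what the text says, and no Miya or Ngizim forms are supplied because the text cites only their root.

```python
from collections import defaultdict

# The word for 'two' in four Chadic languages, as described in the text.
# "inherited" is True for reflexes of the proto-Chadic root *s-r-.
TWO = {
    "Hausa":   {"etymon": "borrowed 'two'", "inherited": False},
    "Bolanci": {"etymon": "borrowed 'two'", "inherited": False},
    "Miya":    {"etymon": "*s-r-", "inherited": True},
    "Ngizim":  {"etymon": "*s-r-", "inherited": True},
}

def shared_innovations(feature):
    """Return the sets of languages that share a non-inherited etymon.

    Shared retentions are skipped on purpose: they show common descent
    from the proto-language, not membership in a subgroup.
    """
    by_etymon = defaultdict(list)
    for lang, info in feature.items():
        if not info["inherited"]:
            by_etymon[info["etymon"]].append(lang)
    return [sorted(langs) for langs in by_etymon.values() if len(langs) > 1]

print(shared_innovations(TWO))
```

Miya and Ngizim never appear in the result, which mirrors the point that their shared root is evidence for Chadic as a whole rather than for a Miya-Ngizim subgroup.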
