Phylogenetic happenings in September 2015

Humans who read grammars can use a variety of tool sets to analyse the linguistic diversity they encounter. One of these tool sets that is still relatively new to this particular set of humans is called 'phylogenetic comparative methods'. These methods assess linguistic features as they evolve on the branches of a family tree (aka phylogenetic tree). The combination of information on the history of a language family with data on typological features allows typologists to do cool things like infer what ancestral languages were like and how quickly features change.

For unknown reasons, magical things happened, planets aligned and in the first two weeks of September EIGHT new cool phylogenetic studies appeared! Let's have a look!

First up are several talks presented at the "Historical relationships among languages of the Americas" workshop at the 48th Annual Meeting of the Societas Linguistica Europaea, held in Leiden (NL) in early September. A number of talks dealt with the reconstruction of the genealogy of language families, and most of these used phylogenetic methods: Thiago Chacon and colleagues presented "New perspectives on Tukanoan language history: A combined framework of quantitative and qualitative approaches", Sérgio Meira and colleagues discussed "A character-based internal classification of the Cariban language family", and Elisabeth Norcliffe and colleagues focused on "The reconstruction and classification of the Barbacoan family of languages". How cool is that! Unfortunately I cannot discuss all of these here, but I'll add links to slides when they become available.

Whereas those three talks presented phylogenetic classifications of various types, Natalia Chousou-Polydouri and colleagues presented a phylogenetic comparative study with the title "Phylogenetic analysis of morphosyntactic data: A case study of negation in Tupí-Guaraní". The same team has previously worked on a phylogenetic tree of the Tupí-Guaraní languages, which are spoken in Amazonia and surrounding areas. In their SLE talk, Chousou-Polydouri et al. present an analysis of negation in 29 Tupí-Guaraní languages. They look at particular constructions used for negation, such as reflexes of the reconstructed morpheme *ani for 'no', as well as special constructions for negative imperatives and directives. They use ancestral state estimations to assess which of these morphemes and constructions can be reconstructed for Proto-Tupí-Guaraní, and show that there is evidence for 6 negator morphemes and 5 different negation constructions in Proto-Tupí-Guaraní. So now we know more about linguistic change in negation strategies in Tupí-Guaraní, and there is more to come from this team, as they are working on a big morphosyntax database!
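To give a feel for what an ancestral state estimation does, here is a minimal sketch in Python. It uses Fitch parsimony, which is a much simpler technique than the likelihood-based or Bayesian methods that studies like this one typically rely on, so please read it as an illustration of the idea and not as the method used by Chousou-Polydouri et al. The tree, the language names A to E and the 0/1 codings are all made up.

```python
# Toy family tree as nested pairs; the leaves are hypothetical languages.
TREE = ((("A", "B"), "C"), ("D", "E"))

# Hypothetical coding: 1 = the language has a reflex of some negator, 0 = it does not.
STATES = {"A": 1, "B": 1, "C": 0, "D": 1, "E": 1}


def fitch(node, states):
    """Bottom-up Fitch pass.

    Returns (set of most parsimonious states at this node,
             minimum number of changes needed in the subtree below it).
    """
    if isinstance(node, str):  # a leaf: its state is simply observed
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:  # the daughters agree on something: no extra change needed
        return shared, left_cost + right_cost
    # the daughters disagree: keep both options and count one change
    return left_set | right_set, left_cost + right_cost + 1


root_states, changes = fitch(TREE, STATES)
print("Possible states at the root:", root_states)
print("Minimum number of changes:", changes)
```

With this toy data the root comes out as {1} with a single change: the proto-language most plausibly had the negator, and language C lost it. If the daughters of the root disagreed, the root set would contain both states, which is the parsimony way of saying "we cannot tell".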

To stay on the topic of Tupí(-Guaraní), last week a special issue of the Boletim do Museu Paraense Emílio Goeldi came out that focuses on the Tupian languages. Two phylogenetic studies appear in this special issue. The first is by Ana Vilacy Galucio and colleagues and is entitled "Genealogical relations and lexical distances within the Tupian linguistic family". This paper is a phylogenetic investigation of the Tupian language family (of which Tupí-Guaraní is a subgroup) - the authors use distance-based methods to investigate the relations between 23 Tupian languages. Galucio et al. compare several different lexical datasets: first and foremost a 100-item Swadesh list, but also subsets of this list that only feature the most retentive (more stable) and least retentive (less stable) items, as well as a dataset of 90 plant and animal names. They find quite tree-like networks, suggesting that the Tupian languages diversified through periods of migration or political separation that created the major subgroups. However, their analysis of plant and animal names suggests that there has been undetected borrowing, indicating some form of (subsequent) language contact between some of the Tupian subgroups. More is to come from this team, as they will continue to investigate the phylogenetics of the Tupian family further using character-based methods!
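For readers wondering what "distance-based" means in practice, here is a small Python sketch of one common way to turn word lists into lexical distances: count the share of meanings for which two languages do not have cognate words. This is an assumption on my part for illustration; I am not claiming it is the exact measure Galucio et al. use. The language names, meanings and cognate codes are invented.

```python
from itertools import combinations

# Hypothetical cognate coding: for each meaning, words that share a number
# are cognate. None marks a missing entry.
COGNATES = {
    "LangA": {"water": 1, "fire": 1, "stone": 1, "eye": 1},
    "LangB": {"water": 1, "fire": 1, "stone": 2, "eye": 1},
    "LangC": {"water": 2, "fire": 1, "stone": 3, "eye": None},
}


def lexical_distance(a, b):
    """Proportion of comparable meanings for which a and b are NOT cognate."""
    shared = [m for m in a if a[m] is not None and b.get(m) is not None]
    if not shared:
        raise ValueError("no meanings are attested in both languages")
    differing = sum(a[m] != b[m] for m in shared)
    return differing / len(shared)


def distance_matrix(data):
    """Pairwise distances for all language pairs, keyed by sorted name pair."""
    return {
        (x, y): lexical_distance(data[x], data[y])
        for x, y in combinations(sorted(data), 2)
    }


DISTANCES = distance_matrix(COGNATES)
for pair, dist in DISTANCES.items():
    print(pair, round(dist, 2))
```

A matrix like this is what a tree-building or network method then takes as input. Running the same function on different word lists (say, the most stable items versus plant and animal names) gives different matrices that can be compared.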

The second phylogenetic study in the special issue is Joshua Birchall's "A comparison of verbal person marking across Tupian languages". Birchall has data on verbal person marking from 16 Tupian languages. He studies how verbal person marking has changed on the branches of two different phylogenetic trees, an expert classification as well as a tree inferred on the basis of lexical data. He finds that Proto-Tupian is most likely to have marked the subject and the object of both transitive and intransitive clauses on its verbs. However, subsequent changes have affected person marking on transitive and intransitive verbs differently. Birchall complements this phylogenetic analysis with an overview of the verbal person markers and their cognacy across languages, showing that quantitative and qualitative methods can go hand in hand to shed light on linguistic diversity :).

Let's move from South America to Africa! On the 14th, a study by Rebecca Grollemund and colleagues came out in the early edition of PNAS. Entitled "Bantu expansion shows that habitat alters the route and pace of human dispersals", this is a phylogenetic study that uses ancestral state estimation to infer the route by which Bantu speakers came to inhabit most of Sub-Saharan Africa. First, Grollemund et al. reconstruct a dated phylogenetic tree on the basis of lexical data from 409 Bantu and 15 Bantoid languages. Then, they use this tree to reconstruct the location of ancestral languages that must have been spoken on the route from what is now northwest Cameroon all the way to easternmost Kenya and southernmost South Africa. As they know where and when these ancestral languages were spoken, they are able to show that Bantu speakers passed through a savannah corridor in the rainforest that appeared around 4000 years ago. Additional studies on migration rates of the various Bantu-speaking peoples into and out of the rainforest indicate that Bantu speakers prefer to live on the savannahs, as moves into the rainforest are typically delayed by around 300 years. This paper is an excellent example of using phylogenetics to make inferences about human history, in this case the impressive spread of the Bantu language family!

For the last study I want to discuss we need to move from Africa to Australia! This one came out on the 16th. Kevin Zhou and Claire Bowern write about "Quantifying uncertainty in the phylogenetics of Australian numeral systems", a phylogenetic comparative study of the number of numeral terms as well as their compositionality in the Pama-Nyungan languages of Australia. What is so amazing about Australian numeral systems (= terms for numbers such as 1, 2, 3, etc.) is that they are typically tiny: Zhou & Bowern write that most of the languages they investigate have a range of 3 to 20 numeral terms, with a median of 4! As in some of the other studies above, ancestral state estimations are used to infer the Proto-Pama-Nyungan numeral system, which probably had 4 numeral terms. They perform additional tests to determine whether there is a correlation in the compositionality of numeral terms for '3' and '4'. Zhou & Bowern demonstrate that such a correlation indeed exists, i.e. if languages form '4' as 2 + 2, they are likely to also form '3' in a non-opaque manner, as 2 + 1. This is a great paper with some very cool figures - Figure 1 is very good for seeing how numeral systems have become smaller in some Pama-Nyungan subgroups and bigger in others.

Thank you for joining me in a journey around the world where we find these cool studies of language family histories and typological diversity featuring explanations in terms of historical change! If only every month could be like September 2015...

If I have missed anything else phylogenetic that appeared in the last three weeks, please comment and I shall update this post.

EDIT 1: even though it is not about linguistics, a paper introducing a new database on Austronesian supernatural beliefs and practices that features several phylogenetic comparative analyses deserves mention here! It came out in PLoS One on the 23rd. The database is called Pulotu and is freely accessible here.

EDIT 2: as a special birthday gift for yours truly, a study on language macro-families came out in PNAS on the 24th. Jeremy Collins discusses it fully in this new post.

