Wednesday, September 27, 2017

Linguistic map making: Drawing polygons

Hedvig has written on how Ethnologue has become even more restricted than it already was, and what resources are out there that could be used instead. One of the things I miss from Ethnologue is its maps - although at least until recently it was still possible to access most of these, by downloading them instead of viewing them in your browser. In her post, Hedvig points out that Langscape can be used instead, and that's all great.

But what if you wanted to draw a map yourself? Especially one which you intend to publish? Some institutions may have access to the World Language Mapping System (WLMS), which lies at the core of Ethnologue's (and Langscape's) maps, and was made by Global Mapping International (which was recently closed, so the WLMS is now formally back with SIL). I'm not sure about the details (and the user agreement parts of the WLMS website are down), but paying a lot of money for the WLMS presumably enables users to draw their own maps and publish them, as long as they cite the source.

Not everyone has access to the World Language Mapping System, and even if you do, it is very likely that your specific needs are not covered by it. For example, have a look at the following map of the border between the Central African Republic, South Sudan, and the Democratic Republic of the Congo. As you can see, where these three countries meet in the center of the map, Zande, one of the biggest Ubangi languages, is spoken on all three sides of the border.

source: http://www.langscape.umd.edu

However, there are some Bantu languages spoken in this border area. One of them is Homa (hom), and as you can see by comparing its location on Glottolog with the Langscape map, it simply does not have a polygon in the World Language Mapping System / Ethnologue maps.

source: http://glottolog.org, http://glottolog.org/resource/languoid/id/homa1239

Homa is a small, underdescribed Bantu language, which according to Sommer (1992: 352) may be threatened with extinction. The only description of it, and an extremely sketchy one, is Santandrea (1963), who describes animacy-based agreement on adjectives, and suggests a heavily attrited gender system - something rather unusual for a Bantu language, with their generally extensive and healthy gender systems. This is the immediate reason for this post - together with Francesca Di Garbo I am looking at gender systems in Bantu, and I would very much like to be able to plot Homa on a map, not just with a point as in Glottolog, but with a polygon that I can color to indicate its special characteristics. A polygon rather than a point also makes it far clearer that this language is spoken in Zande country, far away from most of the rest of the Bantu languages.

It turns out there is an extremely easy way to do this. One can use Google Earth to draw polygons anywhere on the world's surface, save them, and load them into R to make nice maps. Google Earth can be used within the browser here (it wants Chrome), or you can download it here.

Once you open the Google Earth application, you can draw a polygon with the 'draw polygon' tool in the toolbar above the map. While the tool's window is open, you make the polygon by clicking on the map. Then you name it and save it as a .kml file - described in much more detail here. This is the polygon I drew for Homa (see the explanation below):

source: Google Earth

The location of Homa speakers according to Glottolog is close to Nagasi. Santandrea (1948: 81) states speakers of the language can be found in Tombora, and Sommer (1992: 352) puts their location "around towns of Mopoi and Tambura". As you can see on the Glottolog map, this is an area just north of where Glottolog puts the centroid for Homa. So, using Google Earth I drew a kind of oblong shape around these towns, the northwestern one being Tambura, and saved the polygon as Homa.kml. I do not know why there is a discrepancy between these sources and the Glottolog point; that is a story for another time.

Next, we can read the .kml file into R, and place it on a map. Please see code below.

# a library you need to read in .kml files
library(rgdal)
Homa <- readOGR(dsn="Homa.kml")

# a library you need to make the map
library(mapdata)

# plotting the map
map("world2Hires", xlim=c(23, 31), ylim=c(1, 8), boundary=TRUE)
map.axes()
map.scale(cex=0.8)

# putting in country names so we can situate them
text(x = 30, y = 7.5, "South Sudan")
text(x = 26, y = 4.5, "Democratic Republic of Congo")
text(x = 24.5, y = 7, "Central African Republic")

# plot the Homa polygon

plot(Homa, col ="magenta", add=TRUE)

The resulting plot looks like this:

This is only a very partial solution. If you wanted to draw a big map with many languages on it, it would be an enormous amount of work to go through the literature and surveys on where different languages are spoken. This work was already done, at least in part, by Ethnologue / the World Language Mapping System, and it is rather sad to do such work twice, or thrice, etc. However, as the Homa case shows, data on where languages are spoken may be missing in Ethnologue, or may be incomplete, or no longer correct. Especially when you know a particular area in detail, it may be worth drawing your own map, and Google Earth + R makes this very easy. Of course, it would be even better to use actual data to draw ethno-linguistic maps, and not a 70-year-old description, but for some areas of the world, that is something only for the very far future.
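
One caveat for anyone trying this later: the rgdal package has since been retired from CRAN. Below is a minimal sketch of the same steps using the sf package instead. This is an assumption about a suitable replacement, not what was used for the map above, and it relies on the GDAL installation behind sf being able to read .kml files. The helper names read_kml_polygon and plot_language_polygons are made up for illustration; wrapping the steps in functions is just one way to reuse them when there are several languages to plot.

```r
# Sketch only: assumes the sf and mapdata packages are installed.
library(sf)       # reads .kml files through GDAL
library(mapdata)  # high-resolution base map; also attaches the maps package

# Read one Google Earth polygon file and drop the altitude (Z) values
# that Google Earth stores alongside longitude and latitude.
# (read_kml_polygon is a hypothetical helper name, not part of sf.)
read_kml_polygon <- function(path) {
  st_zm(st_read(path, quiet = TRUE))
}

# Draw the base map of the border area and add each polygon in its own color.
# (plot_language_polygons is likewise a hypothetical helper name.)
plot_language_polygons <- function(polygons, colors) {
  map("world2Hires", xlim = c(23, 31), ylim = c(1, 8))
  map.axes()
  for (i in seq_along(polygons)) {
    plot(st_geometry(polygons[[i]]), col = colors[i], add = TRUE)
  }
}
```

With the file from this post, the call would be plot_language_polygons(list(read_kml_polygon("Homa.kml")), "magenta"); further .kml files and colors can be added to the two arguments.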

References

Santandrea, Stefano. 1948. Little Known Tribes of the Bahr El Ghazal. Sudan Notes and Records 29. 78-106.

Santandrea, Stefano. 1963. Short Notes on the Bɔdɔ, Huma and Kare Languages. Sudan Notes and Records 44. 82-99.

Sommer, Gabriele. 1992. A Survey on Language Death in Africa. In Brenzinger, Matthias (ed.), Language Death: Factual and Theoretical Explorations with Special Reference to East Africa, 301-413. Berlin/New York: Mouton de Gruyter.

Tuesday, September 19, 2017

Public service announcement: list of databases and more


Public service announcement: there are websites that keep well-curated lists of things that are useful to linguistics researchers and students, including the following:
It would appear that some don't know about these lists, so now you know/are reminded :).

Lists are good, and instead of reinventing them you can look through these and add to them. For more hopefully useful stuff like this, go here.