CultureLab Seminar III/2026

Note taking

Evolving semantics and etymology

Place
Meeting room of the Institut Jean Nicod (ENS, 29 rue d'Ulm 75005 Paris)
Date
Friday 19th June 2026, from 3.30 PM

The third session of the CultureLab will take place on Friday 19th June 2026 at 3.30 PM in the meeting room of the Institut Jean Nicod (ground floor, 29 rue d'Ulm 75005). Alexandre François, Konstantin Henke and Mathieu Dehouck (LATTICE, CNRS, École Normale Supérieure, Université Sorbonne Nouvelle) will present the EvoSem project, which uses etymology to study the semantic evolution of concepts.


Abstract

Assessing semantic similarity through EvoSem, a catalogue of etymological links

Semantic similarity among concepts can be assessed using many different metrics. One of them is colexification, i.e. when two concepts can be expressed by the same form in certain languages (François 2008); e.g. FEEL and HEAR are colexified by Italian sentire. A more recent approach to assessing semantic closeness is dialexification (François f/c), i.e. when two meanings can at least be related through etymology. For example, the concepts FAMILY and CITY are semantically so distant that no language colexifies them in synchrony; but they are still close enough that they can be “dialexified”, i.e. expressed by cognate words (descendants of the same root). Thus, Gothic gards FAMILY is cognate with Polish gród CITY, and both descend from the Proto-Indo-European root *gʰórdʰos. The same dialexification pattern {FAMILY : CITY} is found under Proto-Semitic *ʔahl-, with {Afar áhli FAMILY : Akkadian ālum CITY}. All in all, dialexification captures broader semantic networks than colexification can on its own.
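As a rough illustration of the two notions, the minimal Python sketch below derives colexified and dialexified concept pairs from the handful of examples quoted above. It is not EvoSem's data model or code: the tuple layout, the names ENTRIES, colexified_pairs and dialexified_pairs, and the choice to leave the etymon of Italian sentire empty (the abstract does not name it) are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy layout: (etymon, language, form, concept).
# Only the examples quoted in the abstract are included.
ENTRIES = [
    ("Proto-Indo-European *gʰórdʰos", "Gothic", "gards", "FAMILY"),
    ("Proto-Indo-European *gʰórdʰos", "Polish", "gród", "CITY"),
    ("Proto-Semitic *ʔahl-", "Afar", "áhli", "FAMILY"),
    ("Proto-Semitic *ʔahl-", "Akkadian", "ālum", "CITY"),
    # Etymon left as None because the abstract does not name it.
    (None, "Italian", "sentire", "FEEL"),
    (None, "Italian", "sentire", "HEAR"),
]


def _pairs_by_group(groups):
    """Turn {group: concepts} into {(concept_a, concept_b): set of groups}."""
    pairs = defaultdict(set)
    for group, concepts in groups.items():
        for a, b in combinations(sorted(concepts), 2):
            pairs[(a, b)].add(group)
    return dict(pairs)


def colexified_pairs(entries):
    """Concept pairs expressed by one and the same form in one language."""
    by_word = defaultdict(set)
    for _etymon, language, form, concept in entries:
        by_word[(language, form)].add(concept)
    return _pairs_by_group(by_word)


def dialexified_pairs(entries):
    """Concept pairs expressed by words descending from the same etymon."""
    by_etymon = defaultdict(set)
    for etymon, _language, _form, concept in entries:
        if etymon is not None:  # skip words whose etymon is not recorded here
            by_etymon[etymon].add(concept)
    return _pairs_by_group(by_etymon)


if __name__ == "__main__":
    print("colexified:", colexified_pairs(ENTRIES))
    print("dialexified:", dialexified_pairs(ENTRIES))
```

On this toy input, {FEEL : HEAR} comes out as colexified (via Italian sentire) and {CITY : FAMILY} as dialexified under both roots, while {CITY : FAMILY} never appears among the colexified pairs.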

These reflections led to the creation of EvoSem (tiny.cc/EvoSem), an empirical catalogue of dialexification patterns across the world’s languages (Dehouck et al. 2023). In this joint talk, Alex François will present the main tenets underlying the creation of EvoSem. Konstantin Henke will present the challenges of harvesting etymological dictionaries in an automated way, and will show the value of creating etymology trees to represent the history of words, including borrowings. Finally, Mathieu Dehouck will show how such a resource can be used to detect interesting pairs of concepts (i.e. concepts dialexified more than expected in a given context) and will discuss why such pairs can appear.

References
Dehouck, Mathieu, Alexandre François, Siva Kalyan, Martial Pastor & David Kletz. 2023. EvoSem: A database of polysemous cognate sets. In Proceedings of the 4th Workshop on Computational Approaches to Historical Language Change, 66–75. Singapore: Association for Computational Linguistics.
François, Alexandre. 2008. Semantic maps and the typology of colexification: Intertwining polysemous networks across languages. In M. Vanhove (ed.), From Polysemy to Semantic Change: Towards a typology of lexical semantic associations, 163–215. Amsterdam: Benjamins.
François, Alexandre. f/c. Recent advances in lexical typology: Comparing patterns of lexification. In A. Aikhenvald & R. M. W. Dixon (eds.), The Cambridge Handbook of Linguistic Typology, 2nd ed. Cambridge: CUP.