John Esling
University of Victoria, Canada

The Larynx as an Articulator
7.8.  9:30-10:30

The larynx (the lower vocal tract) is a complex articulator, as is the tongue in the upper, oral vocal tract. While the tongue is a relatively freely mobile muscular hydrostat, the laryngeal articulator folds and unfolds like a fist, within a scaffold of cartilages, changing the dimensions of the airway postero-anteriorly, latero-medially, and from bottom to top. The vocal folds, ventricular folds, and aryepiglottic folds of the epilaryngeal tube, and the epilaryngeal tube itself, along with changes in larynx height, generate multiple periodic vibrations and complex modifications of lower-vocal-tract resonances, accounting for a broad range of contrastive auditory qualities in the languages of the world. Various instrumental phonetic imaging techniques (laryngoscopy, ultrasound, MRI) illustrate states of the larynx, phonation types, and linguistic realizations from many language families. The vast array of phonetic strategies that the laryngeal articulatory mechanism affords to sound systems should not surprise us, given that it is also the device that infants first control as they test their own phonetic production through the first months of life, so that we can say that “speech originates in the larynx/pharynx”. Even fundamental vowel quality cannot be fully assessed without reference to the auditory, articulatory, and acoustic effects of laryngeal posture.

John Esling is a former Secretary of the International Phonetic Association, former Editor of JIPA, and a Past-President of the International Phonetic Association. He learned phonetics at the University of Michigan with Ian Catford and Kenneth Pike and at the University of Edinburgh with David Abercrombie, John Laver, and James (Tony) Anthony. He taught at the University of Leeds, then at the University of Victoria in Canada, concentrating on auditory and articulatory phonetics, particularly the categorization of voice quality, of vocal register, and of the phonetic production and modelling of laryngeal and pharyngeal sounds. He is a co-author of Voice Quality: The Laryngeal Articulator Model and co-developer of the app – iPA Phonetics – and has been fortunate to visit and collaborate with so many knowledgeable, amenable, and generous colleagues around the world.

Andrea Ravignani
Max Planck Institute for Psycholinguistics, Netherlands & Aarhus University

The origins of rhythm and vocal learning: A comparative approach
8.8.  11:00-12:00

Who’s got rhythm? And why are we such chatty animals? Human music and speech are peculiar behaviors from a biological perspective: although extremely common in humans, at first sight they do not seem to confer any direct evolutionary advantage. Many hypotheses try to explain the origins of acoustic rhythm capacities in our species, but few are empirically tested and compared. Because music and speech do not fossilize, and we lack a time machine, the comparative approach provides a powerful tool to tap into human cognitive history. Notably, homologous or analogous building blocks underlying human rhythm can be found across a few animal species and developmental stages. Hence, investigating rhythm across species is not only interesting in itself, but also crucial for unveiling music-like and speech-like behaviors present in early hominids. In this talk, I will discuss the major hypotheses for the evolution of vocal rhythmicity in humans and other animals, which link acoustic rhythms to vocal learning (a precursor to speech), gait, breathing, or chorusing. I will suggest how integrating approaches from ethology, psychology, neuroscience, modeling, voice sciences, and physiology is needed to obtain a full picture. I will then zoom in on some crucial species which are key to testing alternative hypotheses on rhythm origins, with particular attention to the rhythm-vocal learning link. I will show how three strands of research, partly neglected until now, can be particularly fruitful in shedding light on the evolution of rhythm and vocal learning. First, I will present rhythm experiments in marine mammals, primates, and other species, suggesting that rhythm research in non-human animals can also benefit from ecologically-relevant setups, combining strengths and knowledge from human cognitive neuroscience and behavioral ecology. Second, I will discuss the interplay between vocal anatomy, learning, and development in harbor seal pups, arguing for their importance as model species for human speech origins. Finally, I will present human experiments where musical rhythm is created and evolves culturally due to cognitive and motoric biases, showing the importance of an interplay between biology and cultural transmission. These results suggest that, while some species may share one or more building blocks of speech and music, the ‘full package’ may be uniquely human.

Andrea Ravignani is a Professor at the Department of Human Neurosciences, Sapienza University of Rome, Italy. Until May 2023, he was an Associate Professor at the Center for Music in the Brain, Aarhus University, Denmark, and a W2 Independent Group Leader at the Max Planck Institute for Psycholinguistics, where he led the Comparative Bioacoustics Research Group.
Andrea has studied, researched and worked in several areas, including mathematics, biology, speech sciences, musicology, computer science and cognitive psychology - this multidisciplinarity is mirrored in his research team. Andrea’s research group at the MPI was highly interdisciplinary, featuring 10 scientists from many areas, including cognitive neuroscience, ethology, experimental psychology, linguistics, communication sciences, computer science, AI, bioacoustics, primatology and marine mammalogy.
Since the end of his PhD in 2014, Andrea has written approximately 100 works (journal articles, book chapters, etc.) published in Nature Human Behaviour, Cognition, Nature, PNAS, Current Biology, Science, Nature Communications, Music Perception, Trends in Cognitive Sciences, etc.
Recently, Andrea has been awarded an ERC Starting Grant, an HFSP Grant, and a Sapienza PI grant to investigate the origins of rhythm and vocal learning using a multi-species and multi-methods approach. He firmly believes in and supports kindness in science.

Titia Benders
University of Amsterdam, Netherlands

How do children acquire the other 98.5% of the world’s languages?
The case of 3 projects on Nepali stops, Akan ATR harmony, and (Saudi)-Arabic emphatics

9.8.  9:00-10:00

For a comprehensive understanding of how children around the world learn to speak and listen to their language(s), the study of language acquisition must diversify substantially (Aravena-Bravo et al., under review; Cristia et al., accepted; Kidd & Garcia, 2022). This keynote responds to this need by sharing insights from projects on 3 lesser-studied languages, each with phonological elements whose acquisition cannot be studied in the currently estimated 2% of the world’s languages for which acquisition data are available (Kidd & Garcia, 2022).
Project 1) The realisation of the 4-way voicing contrast in Nepali Infant-Directed Speech: This lab-based study, originally led by Sujal Pokharel, started by asking how the temporal cues to the Nepali 4-way voicing contrast are produced when speaking to infants. To our surprise, we didn’t observe the canonical combination of pre-voicing and breathiness for the (cross-linguistically rare) breathy stops, leading us on a path of discovery into changing cues or a changing system.
Project 2) The perception and processing of Akan ATR (Advanced Tongue Root) harmony by multilingual infants growing up in Ghana: Using a mobile infant perception lab across sites in Accra, the capital of Ghana, Paul Okyere Omane hopes to find out whether infants with Akan among their input languages prefer listening to words with ATR harmony and use the absence of such harmony as a cue to word boundaries. Results on the experimental perception tasks will be connected to estimates of the proportion of Akan (and thus ATR harmony) in children’s input.
Project 3) The acquisition of the plain-emphatic contrast by 3-to-6-year-old Saudi-Arabic children: For this first acoustic-phonetic study into the acquisition of the doubly-articulated emphatic consonants, Anwar Alkhudidi has conducted a hybrid online/in-person production experiment. Contrary to impressionistic reports that the plain-emphatic contrast is acquired after 5 or 6 years of age, initial results provide some evidence of a contrast already in 3-year-olds.
In addition to sharing the discoveries from each project, I will reflect on the challenges involved in child language acquisition studies without lab facilities, language descriptions, or adult baseline data. I will close with suggestions on how we, as a field, can support research and researchers of the acquisition of lesser-studied languages.

Titia Benders is Assistant Professor in Linguistics at the University of Amsterdam (The Netherlands), where she completed her PhD in 2013 and returned in 2021 after a post-doc at Radboud University Nijmegen (The Netherlands), a Lecturer position at the University of Newcastle, and a 7-year (Senior) Lecturer position at Macquarie University Sydney (both in Australia). Combining research methods from phonetics and (developmental) psychology, Benders investigates developing phonological representations at the interface between input, perception, and production. Her main interest is the acquisition of segmental and prosodic representations by children between 6 months and 6 years of age, who acquire one or more languages, without or with hearing loss. Thanks to collaborations with junior colleagues, her work now addresses such research questions about the phonological elements in lesser-studied languages as well.

Pavel Trofimovich
Concordia University, Montreal, Canada

Beyond accent, attitudes, and native speakers: What might socially responsible second language speech research look like?
10.8.  11:00-12:00

The state of today’s world, reeling from climate emergencies and humanitarian catastrophes, reinforces the need for researchers to promote and engage in socially responsible research practices. What we do, as researchers and human beings, should ideally matter for those who come after us and will inherit the world we leave behind. However, what could socially responsible research look like? And how do we balance social responsibility with intellectual curiosity (and with the daily demands of academic jobs)? In this presentation, I will reflect on my own and my colleagues’ work in the field of second language, bilingual, and multilingual speech processing and learning, taking you on a personal journey as I struggle to reconcile theoretical curiosity with social relevance. I will take stock of several conceptual and methodological achievements by second language speech researchers in past years. I will then turn to the future and provide a view of possible new (or rediscovered old) agendas for second language speech learning, highlighting the dynamic, variable, multifaceted, and multimodal nature of speech learning and use. Above all, I will highlight the importance of socially relevant research practices which are useful to the daily lives of language speakers.

Pavel Trofimovich is Professor of Applied Linguistics in the Department of Education at Concordia University in Montreal, Canada. His research focuses on cognitive aspects of second language processing and learning, the acquisition of second language pronunciation and speaking skills by children and adults, sociolinguistic aspects of second language acquisition, and the teaching of second language pronunciation. He has published extensively in many top scholarly venues in the fields of language learning, cognitive psychology, and language teaching, with over 100 journal articles, chapters, and conference proceedings published, and is the author of three books on the use of cognitive psycholinguistic research methods in second language research and assessment of second language pronunciation. He has served as Associate Journal Editor (2012–2015) and as Journal Editor (2015–2019, 2022–2023) for Language Learning.

Jane Stuart-Smith
University of Glasgow, United Kingdom

What can speakers tell us about speech?
10.8.  16:20-17:20

Phonetic and phonological variation is systematic, structured, and informative to speakers and hearers in many ways and on many levels (e.g. Chodroff & Wilson, 2022). Sources of structured variability range from phonetic, linguistic and interactional factors, to those to do with speakers themselves, personal and social. Integrated views of understanding speech as intricately bound up with speakers in their local communities, past and present, are found in early formal phonetic observations (e.g. Rousselot 1891 in Demolin forthcoming, 2023). Abercrombie (e.g. 1967) theorized properties of speech relating to the speaker in terms of ‘indexicality’, alongside those for grammatical contrast and interactional function. More recently, sociophonetic approaches recognize how speech carries and constructs ‘social-indexical’ cues pointing to speakers’ social identities (e.g. Foulkes & Docherty 2006; Kendall et al forthcoming 2023). But what does social-indexicality look like in practice for the speech of a community? Does changing the scale of our sociophonetic scope alter the picture? And how does including social information about speakers extend our understanding of speech more generally? This talk will focus on three aspects of English speech examined within the local sociolinguistic context of the city of Glasgow, and the broader contexts of Scotland, the British Isles and North America. Synchronic and diachronic evidence will be drawn from existing and new acoustic data from sociophonetic projects including Sounds of the City (e.g. Stuart-Smith et al 2017) and SPeech Across Dialects of English (SPADE) (e.g. Sonderegger et al 2022), as well as Scottish social-articulatory phonetic data (e.g. Lawson et al 2014). Specifically, I will discuss what we can learn from:

  • Sibilants /s ʃ/, and speaker gender and dialect (e.g. Stuart-Smith 2020)
  • Rhotics and rhoticity, and speaker social class, ethnicity and dialect (e.g. Lawson et al 2018)
  • Patterns of vowel duration, especially the Scottish Vowel Length Rule and the Voicing Effect, and speaker gender and dialect (e.g. Rathcke & Stuart-Smith 2016; Tanner et al. 2020)

I will conclude by considering the issue of where the ‘social’ might reside for speakers, as a sometimes separable, but usually integral, part of their phonetic and phonological knowledge.

Jane Stuart-Smith has been Professor of Phonetics and Sociolinguistics at the University of Glasgow since 2013, first joining the University in 1997 as Lecturer in English Language, where she has worked with colleagues to develop the Glasgow University Laboratory of Phonetics (GULP). With members of GULP and collaborators outwith Glasgow, she considers the many relationships between speech and society, taking the rich linguistic variation in Scotland as the basis for her work (e.g. Sounds of the City), and more recently, phonetic and phonological variation over space and time in Englishes of the British Isles and North America (SPeech Across Dialects of English - SPADE). Jane also works closely with colleagues in Scotland to promote the public understanding of phonetics by developing accessible web resources for speech and accents (e.g. Seeing Speech; Dynamic Dialects).

Paul Boersma
University of Amsterdam, Netherlands

Praat in the next 30 years
11.8.  15:00-16:00

Paul Boersma received an MSc in physics from the University of Nijmegen in 1988 and a PhD in linguistics from the University of Amsterdam in 1998. Since 2005 he has been Professor of Phonetic Sciences at the University of Amsterdam. His research focuses on modelling and simulating the acquisition, evolution and typology of the production and comprehension of phonology and phonetics. For this he developed a bidirectional model of phonology and phonetics (BiPhon) in which the speaker and listener travel the same morphological, phonological and phonetic levels of representation, which are connected by symmetric constraints that are weighted or ranked, or by symmetric neural network connections. His further research involves the history of the Franconian tone systems. Boersma is also the designer and main author (with David Weenink) of Praat, the world’s most used computer program for the analysis and manipulation of speech.


GUARANT International spol. s r.o.
Českomoravská 19, 190 00  Prague 9, Czech Republic
Phone: +420 284 001 444

© 2022–2023 GUARANT International spol. s r. o.