PÂTÉ: A Corpus of Temporal Expressions for the In-car Voice Assistant Domain


The recognition and automatic annotation of temporal expressions (e.g. “Add an event for tomorrow evening at eight to my calendar”) is a key capability for AI voice assistants, because it allows them to interact with apps such as a calendar app. However, research on temporal expressions in the NLP literature has focused mostly on data from the news, the clinical domain, and social media. The voice assistant domain differs substantially from these domains and therefore requires a dedicated data collection. We present a crowdsourcing method that uses pictures and scenario descriptions to elicit natural-language commands containing temporal expressions for an AI voice assistant. We annotated the 480 elicited commands, as well as the commands in the Snips dataset, following the TimeML/TIMEX3 annotation guidelines, for a total of 1188 annotated commands. The annotated commands can later be used to train the NLU components of an AI voice assistant.
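To make the annotation target concrete, the following minimal sketch wraps the temporal expression from the example command above in a TIMEX3 tag. It is an illustration only, not the annotation procedure used for the corpus: the function name `annotate_example` is hypothetical, the reference date is an assumed value, and treating the whole phrase as a single TIMEX3 of type TIME is an assumption about how such an expression might be annotated.

```python
from datetime import date, datetime, time, timedelta

# Hypothetical helper: handles only the one phrase used in the example command.
PHRASE = "tomorrow evening at eight"


def annotate_example(command: str, reference_date: date) -> str:
    """Wrap PHRASE in a TIMEX3 tag, resolved against reference_date."""
    start = command.find(PHRASE)
    if start < 0:
        # Phrase not present: return the command unchanged.
        return command
    # "tomorrow" = reference date + 1 day; "evening at eight" = 20:00.
    target = datetime.combine(reference_date + timedelta(days=1), time(20, 0))
    value = target.strftime("%Y-%m-%dT%H:%M")  # ISO 8601-style value
    tag = f'<TIMEX3 tid="t1" type="TIME" value="{value}">{PHRASE}</TIMEX3>'
    return command[:start] + tag + command[start + len(PHRASE):]


if __name__ == "__main__":
    # Assumed reference date, chosen only for illustration.
    print(annotate_example(
        "Add an event for tomorrow evening at eight to my calendar",
        date(2020, 5, 11),
    ))
```

The `value` attribute holds the normalized time, which is what a calendar app would consume; resolving it requires a reference date, here supplied explicitly.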

Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020)
Alessandra Zarcone
Professor of Language Technologies and Cognitive Assistants

Computational linguist with a background in NLP and in psycholinguistics, working on human-machine interaction.