SciLang: Ambiguity in Language Workshop

Language can be ambiguous. For example, the sentence "Police help dog bite victim" could mean either "police help a dog to bite a victim" or "police help a victim who was bitten by a dog" (i.e., a dog-bite victim). The more ambiguous a text is, the harder readers must work to understand it, and the more likely they are to misinterpret it. Ambiguous text is also becoming more common as hastily written communication spreads with the rise of e-mail, social networks, and instant messaging platforms.

The goal of this workshop is to consider ambiguity in text, and to ground that discussion in methodological practices that support the study of ambiguity. We will first discuss the nature of ambiguity in text, including how it arises and the problems it can cause. We’ll then work through a group exercise in which participants comb through a data set of e-mails and document instances of textual ambiguity, with the following questions in mind: How often does ambiguity occur in real communication? What does it look like? How often is it harmful? In probing these research questions, participants will learn fundamental research skills for annotating corpora to address scientific questions. To conclude, we will analyze the data collected during the workshop and discuss how machine learning might be used to automate the detection of ambiguities.