Schedule

Day 1: 29th May 2022

Registration 12:00 - 12:30

12:30 - 13:30

Welcome Lunch and Kickoff for the LT-Bridge Summer School


13:30 - 15:00

Personalisation

Prof Owen Conlan, TCD

Modelling a user's engagement with a complex system can allow that system to tailor how it interacts with and functions for them. However, many user models are stored in a form that is almost impossible for users to interpret, presenting a fundamental barrier to the user's control over how the system functions for them. Scrutability is a principle from personalisation research that strives towards user-scrutable and controllable systems, promoting user-understandable models and transparent methods of control.

This session will introduce Personalisation as a field of research. It will discuss user, domain and content modelling, making specific references to a number of case studies, including tailoring Technology Enhanced Learning (TEL) to individual needs; supporting users in identifying disinformation online; and designing proactive intelligent personal agents that can act on our behalf. The role of scrutability as a means of promoting user understanding, reflection, regulation and system control will be a recurring theme in this session and in the case studies examined.

Break 15:00 - 15:30

15:30 - 17:00

Dialog, Dialog Systems and Chatbots: Session 1

Dr Emer Gilmartin, TCD

The two-session module will give an overview of dialog technology, from simple chatbots to advanced research prototypes, discuss the technologies involved, and focus on areas of interest to the participants.

The first session will cover:

  • Basics of dialog - structure and pragmatics

  • Natural language interfaces

The second session will focus on 3 topics of interest to participants. Topics can be chosen by participants in the first session.

BBQ (18:00 - 21:00)

Day 2: 30th May 2022

9:00 - 10:30

Hands-on with neural machine translation: a peek inside (Part 1)

Prof Mikel L. Forcada

Course attendees will install the Transformers neural machine translation (NMT) library and HuggingFace pretrained models for one or more language pairs of their choice on their own laptops. After an introduction to the principles of transformer NMT training and functioning, they will fire up the models for their languages of choice, and use simple commands in an interactive Python shell to study how input and output texts are processed in terms of sub-words, how to control decoding, etc.

Python experience is useful but not necessary. Registered participants will receive the course handout in advance to prepare their laptops for the session.
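
As a taste of the sub-word processing mentioned in the session description, here is a minimal toy sketch in plain Python. It is an illustration only: the vocabulary, the `segment` function and the `##` continuation marker are hypothetical choices made for this example, and the pretrained models used in the session may segment text quite differently.

```python
def segment(word, vocab, unk="<unk>"):
    """Greedy longest-match-first segmentation of one word into sub-words.

    Pieces that continue a word are looked up with a '##' prefix
    (an assumed convention for this toy example).
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No known piece covers this position: give up on the word.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, not taken from any real model.
TOY_VOCAB = {"trans", "##lat", "##ion", "##ions", "un", "##related"}

if __name__ == "__main__":
    print(segment("translations", TOY_VOCAB))
    print(segment("xyz", TOY_VOCAB))
```

A word that is not in the vocabulary as a whole is split into smaller known pieces, and a word with no matching pieces falls back to a single unknown token.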


Break 10:30 - 11:00

11:00 - 12:30

Hands-on with neural machine translation: a peek inside (Part 2)

Prof Mikel L. Forcada

Lunch 12:30 - 13:30

13:30 - 15:00

Research Development

A series of 3 short talks aimed at equipping early career researchers with the skills needed to understand and navigate the research funding landscape and accelerate their research careers. Topics covered:

  • Early-Stage Career Development: Establishing a research career through funding

  • Applying for research funding for new researchers

  • How to read a funding call

Break 15:00 - 15:30

15:30 - 17:00

Poster Boaster

Pitch your research ahead of the poster session and drinks reception

Poster Session & Drinks Reception (18:00 - 20:00)

Day 3: 31st May 2022

9:00 - 10:30

Dialog, Dialog Systems and Chatbots: Session 2

Dr Emer Gilmartin, TCD

The two-session module will give an overview of dialog technology, from simple chatbots to advanced research prototypes, discuss the technologies involved, and focus on areas of interest to the participants.

The first session will cover:

  • Basics of dialog - structure and pragmatics

  • Natural language interfaces

The second session will focus on 3 topics of interest to participants. Topics can be chosen by participants in the first session.

Break 10:30 - 11:00

11:00 - 12:30

Anaphora and Coreference Resolution

Dr Yufang Hou, IBM Research/TU Darmstadt

Anaphora is the linguistic phenomenon of referring back to a previously mentioned entity or event in a document. Coreference refers to the phenomenon that multiple expressions in a text refer to the same entity or event. Both anaphora resolution and coreference resolution are challenging tasks in discourse processing. In this lecture, I’ll first give a brief introduction to the linguistic background of anaphora and coreference. Then I will review the main research questions in anaphora resolution and coreference resolution, discussing common sub-tasks, benchmarks, evaluation metrics, and recent advances. Next, I’ll talk in detail about some recent work on bridging resolution. In the final discussion session, I will invite participants to share their thoughts on the topic.

Lunch 12:30 - 13:30

13:30 - 15:00

A Crash Course in Distributional Semantics: Probing Natural Language Embeddings for Meaning and Linguistic Structure

Filip Klubicka

The semantic relatedness of words has two key dimensions: it can be based on taxonomic information or thematic, co-occurrence-based information. These are captured by different language resources—taxonomies and natural corpora—from which we can build different computational meaning representations that are able to reflect these relationships. Vector representations are arguably the most popular meaning representations in NLP, encoding information in a shared multidimensional semantic space and allowing for distances between points to reflect relatedness between items that populate the space.

However, research has shown that other, non-semantic linguistic structures also tend to emerge in vector representations of language, even though the models are not explicitly instructed to encode them. With the aim of interpreting embedding models and meaning representations, the notion of probing has gained considerable traction in the NLP community. It has allowed for an improved understanding of how different types of linguistic information are encoded in vector space, providing valuable insight to the field of model interpretability and furthering our understanding of different encoder architectures.
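
The idea that distances in a shared vector space reflect relatedness can be illustrated with a minimal sketch. The three-dimensional vectors below are made-up toy values chosen for this example, not embeddings from any real model, and cosine similarity is used here as one common choice of relatedness measure.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings: "cat" and "dog" are placed close together,
# "car" is placed elsewhere in the space.
TOY_EMBEDDINGS = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

if __name__ == "__main__":
    for other in ("dog", "car"):
        score = cosine_similarity(TOY_EMBEDDINGS["cat"], TOY_EMBEDDINGS[other])
        print(f"cat vs {other}: {score:.3f}")
```

With these toy values, "cat" scores higher against "dog" than against "car", which is the kind of relatedness signal the session description refers to.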

Break 15:00 - 15:30

15:30 - 17:00

Giorraíonn BERT Bóthar: A collection of benchmark datasets and implementations for Irish NLP tasks

Prof Kevin Scannell, Saint Louis University

I have been working on developing Irish language technology for almost 25 years, and over that time I have accumulated a large number of datasets used for training models to perform various important NLP tasks: machine translation, language modeling, spelling and grammar correction, etc. Some, but not all, of these datasets are publicly available. I will begin by surveying some of this work, and laying out what I view as key priorities for language communities seeking to develop advanced language technologies. I will then introduce a new resource that brings together all of these datasets in one place, together with baseline implementations of the various tasks that others can build on. This is similar in spirit to efforts like Papers with Code and nlpprogress.com, although we have taken steps to try to mitigate some of the negative influence that so-called "leaderboard culture" has had on NLP research for English and other major languages.

Conference Dinner

Marinas Restaurant at the Galmont Hotel (https://g.page/thegalmont?share) at 20:00



Day 4: 1st June 2022

9:00

Full Day Tour to the Aran Islands (Inis Oírr) and the Cliffs of Moher