Text Technology/Digital Linguistics colloquium FS 2025

Time & Location: every 2 weeks on Tuesdays from 10:15 am to 12:00 pm in room BIN-2-A.10.

Online participation is also possible via the MS Teams team "CL Colloquium".

Colloquium Schedule

18.02.2025	Martin Volk
04.03.2025	(Chiara Tschirner: postponed)	Pius Peter Hugo von Däniken
18.03.2025	Hanxu Hu	Yang Tian
01.04.2025	Sophia Conrad	Ahmet Yavuz Uluslu
15.04.2025	Ghassen Karray	Deborah Jakobi
29.04.2025	Janis Goldzycher	Teodora Vukovic's group
13.05.2025	Kirill Semenov	Zifan Jiang
27.05.2025	Yingqiang Gao	Lena Sophia Bolliger

18 Feb 2025

Martin Volk: Bullinger Digital: Machine Translation for Mixed Language 16th-Century Letters

The project “Bullinger Digital” deals with the letter collection of Heinrich Bullinger: 12,000 letters in Latin and Early New High German from the 16th century. We will give an overview of the code-switching between these languages and present a novel visualisation for profiling the language mix between correspondence partners over time.

In this project we extensively experimented with GPT models, Gemini and Perplexity for Machine Translation of the letters into modern German and English. We found that these LLMs currently offer the best quality for machine translations across 500 years. In addition, these LLMs allow for interesting knowledge injections. When translation suggestions for single words or phrases are available (e.g. from footnotes in the edition or from external lexicons or name lists), we can add them to the prompt and thus improve the translation results.

LLMs also show impressive abilities for syntax analysis of Latin sentences. We used them to analyse triadic greeting sentences (e.g. a writer sending regards to the addressee’s collaborators) and to determine if the persons mentioned in these sentences are senders or receivers of the greetings.

In the talk we will present our experiments and results from these studies.
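The knowledge injection mentioned in the abstract, adding known word or phrase translations to the prompt, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual pipeline: the function name `build_translation_prompt`, the prompt wording, the sample sentence and the glossary entries are all hypothetical.

```python
def build_translation_prompt(source_text, target_language, glossary):
    """Assemble a translation prompt that carries known term translations.

    `glossary` maps source words or phrases to suggested translations
    (e.g. from edition footnotes or name lists). Only entries that
    actually occur in the source text are added to the prompt.
    """
    lines = [f"Translate the following 16th-century letter passage into {target_language}."]
    lowered = source_text.lower()
    hints = [(src, tgt) for src, tgt in glossary.items() if src.lower() in lowered]
    if hints:
        lines.append("Use these translation suggestions where they apply:")
        lines.extend(f"- {src} -> {tgt}" for src, tgt in hints)
    lines.append("")
    lines.append("Text:")
    lines.append(source_text)
    return "\n".join(lines)


# Hypothetical sample sentence and glossary, for illustration only.
prompt = build_translation_prompt(
    "Salutat te Bibliander.",
    "English",
    {"Bibliander": "Theodor Bibliander", "Tigurum": "Zurich"},
)
print(prompt)
```

The resulting string would then be sent to whichever LLM is used. Filtering the glossary to entries that occur in the passage keeps the prompt short when the lexicon is large.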

4 Mar 2025

(Postponed) Chiara Tschirner: Visual search as a predictor for reading comprehension and fluency: Insights from 'Lesen im Blick'

Previous research suggests that efficient visual search is tied to strong reading ability in children (e.g., Ferretti et al. 2008). So far, the focus has been on reading abilities in terms of reading fluency rather than reading comprehension. As part of 'Lesen im Blick', a longitudinal study on reading development in preschool and primary school children, we are able to further investigate this connection by using eye-tracking and include other markers of reading ability, such as reading comprehension. In this talk, I will present ongoing work in this direction, as well as an update on the overall project.

Pius Peter Hugo von Däniken: System Dependence in Machine Translation Metrics: A Challenge for Fair Evaluation

Automated metrics are widely used for evaluating machine translation (MT) systems, offering a scalable alternative to human assessments. However, despite their high correlation with human judgements, open challenges remain. In this talk, I will introduce an under-explored concept that we call system dependence: simply put, the same metric score does not correspond to the same human score for every system. This can contribute to inconsistencies in rankings and raise questions of fairness, as current evaluation protocols crucially rely on the assumption that a metric will treat all systems equally. I will present a method for measuring system dependence and illustrate its application to recent WMT metrics tasks.
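To make the notion of system dependence concrete, here is a toy sketch. It is not the measurement method presented in the talk: it simply fits a separate least-squares line from metric score to human score for each system and compares the human score each line predicts at the same metric value. The system names and all numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up (metric scores, human scores) for two hypothetical MT systems.
scores = {
    "system_A": ([0.60, 0.70, 0.80, 0.90], [55.0, 65.0, 75.0, 85.0]),
    "system_B": ([0.60, 0.70, 0.80, 0.90], [45.0, 55.0, 65.0, 75.0]),
}

metric_value = 0.75  # the same metric score for both systems
expected_human = {}
for name, (metric, human) in scores.items():
    slope, intercept = fit_line(metric, human)
    expected_human[name] = slope * metric_value + intercept

# A non-zero gap means the same metric score maps to different human scores.
gap = expected_human["system_A"] - expected_human["system_B"]
print(expected_human, gap)
```

In this toy data the metric correlates perfectly with the human scores within each system, yet the same metric score of 0.75 corresponds to human scores about 10 points apart. A ranking based on the metric alone would treat the two systems as equal.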

18 Mar 2025

Hanxu Hu: From Policy-Gradient RL to In-Context RL: Can LLMs Learn Feedbacks From Context?

Large Language Models (LLMs) have demonstrated impressive capabilities in learning from in-context examples, yet their potential to learn from textual feedback and verbalized rewards remains under-explored. Traditional In-Context Learning (ICL) provides question-answer pairs from which LLMs learn to handle similar tasks at test time, functioning as a proxy for supervised fine-tuning. Meanwhile, Reinforcement Learning from Human Feedback (RLHF) has revolutionized the field, powering products like ChatGPT and O1 using Policy-Gradient RL algorithms, which enable LLMs to learn effectively from human or environmental feedback. This raises a fundamental question: Can LLMs learn from feedback solely through context, similar to ICL? In this project, we investigate this question by examining in-context RL in mathematical reasoning tasks and provide empirical insights into this novel learning paradigm.

Yang Tian: TBA

1 Apr 2025

Sophia Conrad: TBA

Ahmet Uluslu: TBA

15 Apr 2025

Ghassen Karray: TBA

Deborah Jakobi: TBA

29 Apr 2025

Janis Goldzycher: TBA

Teodora Vukovic: TBA

13 May 2025

Kirill Semenov: TBA

Zifan Jiang: TBA

27 May 2025

Yingqiang Gao: TBA

Lena Bolliger: TBA