Institut für Computerlinguistik

Text Technology/Digital Linguistics colloquium FS 2025

Time & Location: every 2 weeks on Tuesdays from 10:15 am to 12:00 pm in room BIN-2-A.10.

Online participation via the MS Teams team "CL Colloquium" is also possible.

Responsible: Anne Göhring

Colloquium Schedule

 
18.02.2025  Martin Volk
04.03.2025  Pius Peter Hugo von Däniken (Chiara Tschirner: postponed)
18.03.2025  Hanxu Hu, Yang Tian
01.04.2025  Sophia Conrad, Ahmet Yavuz Uluslu
15.04.2025  Ghassen Karray, Deborah Jakobi
29.04.2025  Janis Goldzycher, Teodora Vukovic's group
13.05.2025  Kirill Semenov, Zifan Jiang
27.05.2025  Yingqiang Gao, Lena Sophia Bolliger

18 Feb 2025

Martin Volk: Bullinger Digital: Machine Translation for Mixed Language 16th-Century Letters

The project “Bullinger Digital” deals with the letter collection of Heinrich Bullinger: 12,000 letters in Latin and Early New High German from the 16th century. We will give an overview of the code-switching between these languages and present a novel visualisation for profiling the language mix between correspondence partners over time.

In this project, we experimented extensively with GPT models, Gemini and Perplexity for machine translation of the letters into modern German and English. We found that these LLMs currently offer the best quality for machine translation across 500 years of language change. In addition, these LLMs allow for interesting knowledge injections. When translation suggestions for single words or phrases are available (e.g. from footnotes in the edition or from external lexicons or name lists), we can add them to the prompt and thus improve the translation results.
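The knowledge-injection idea described above (adding known renderings of words or names to the prompt) can be sketched minimally. The function name and prompt wording below are illustrative assumptions, not the project's actual code:

```python
def build_translation_prompt(source_text, suggestions, target_lang="modern German"):
    """Build an MT prompt that injects known word/phrase translations.

    suggestions: mapping from source terms (e.g. taken from edition
    footnotes, external lexicons or name lists) to preferred renderings.
    """
    lines = [
        f"Translate the following 16th-century letter into {target_lang}.",
        "Where these terms occur, use the given translations:",
    ]
    # One bullet per injected suggestion.
    lines += [f'- "{src}" -> "{tgt}"' for src, tgt in suggestions.items()]
    lines += ["", source_text]
    return "\n".join(lines)

# Toy example: inject the place name "Tigurum" -> "Zürich".
prompt = build_translation_prompt(
    "Salve, mi Bullingere.",
    {"Tigurum": "Zürich"},
)
```

The prompt is then sent to the LLM as usual; only the suggestion list changes per letter.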

LLMs also show impressive abilities for syntax analysis of Latin sentences. We used them to analyse triadic greeting sentences (e.g. a writer sending regards to the addressee’s collaborators) and to determine if the persons mentioned in these sentences are senders or receivers of the greetings. 

In the talk we will present our experiments and results from these studies.

4 Mar 2025

Chiara Tschirner (postponed): Visual search as a predictor for reading comprehension and fluency: Insights from 'Lesen im Blick'

Previous research suggests that efficient visual search is tied to strong reading ability in children (e.g., Ferretti et al. 2008). So far, the focus has been on reading ability in terms of reading fluency rather than reading comprehension. As part of 'Lesen im Blick', a longitudinal study on reading development in preschool and primary school children, we are able to investigate this connection further using eye-tracking and to include other markers of reading ability, such as reading comprehension. In this talk, I will present ongoing work in this direction, as well as an update on the overall project.

Pius Peter Hugo von Däniken: System Dependence in Machine Translation Metrics: A Challenge for Fair Evaluation

Automated metrics are widely used for evaluating machine translation (MT) systems, offering a scalable alternative to human assessment. However, despite their high correlation with human judgements, open challenges remain. In this talk, I will introduce an under-explored concept that we call system dependence: simply put, the same metric score does not correspond to the same human score for every system. This can contribute to inconsistencies in rankings and raises questions of fairness, as current evaluation protocols crucially rely on the assumption that a metric treats all systems equally. I will present a method for measuring system dependence and illustrate its application to recent WMT metrics tasks.
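To make the notion of system dependence concrete (a toy sketch, not the talk's actual method): fit a separate metric-to-human mapping per system, then compare the human score each mapping predicts for the same metric score. If the predictions differ, the metric score means different things for different systems:

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = a * x + b.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Toy (metric score, human score) pairs for two hypothetical systems.
sys_a = ([0.60, 0.70, 0.80], [0.55, 0.65, 0.75])
sys_b = ([0.60, 0.70, 0.80], [0.45, 0.60, 0.75])

a1, b1 = fit_line(*sys_a)
a2, b2 = fit_line(*sys_b)

# Same metric score, different predicted human scores:
pred_a = a1 * 0.70 + b1
pred_b = a2 * 0.70 + b2
```

Here a metric score of 0.70 maps to different human scores for the two systems, which is exactly the assumption violation the abstract describes.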

18 Mar 2025

Hanxu Hu: From Policy-Gradient RL to In-Context RL: Can LLMs Learn From Feedback in Context?

In-Context Learning (ICL) in Large Language Models (LLMs) has demonstrated impressive capabilities for learning from in-context examples, yet the potential of LLMs to learn from textual feedback and verbalized rewards remains under-explored. Traditional ICL provides question-answer pairs from which LLMs learn to handle similar tasks at test time, functioning as a proxy for supervised fine-tuning. Meanwhile, Reinforcement Learning from Human Feedback (RLHF) has revolutionized the field, powering products like ChatGPT and O1 with policy-gradient RL algorithms that let LLMs learn effectively from human or environmental feedback. This raises a fundamental question: Can LLMs learn from feedback solely through context, similar to ICL? In this project, we investigate this question by examining in-context RL on mathematical reasoning tasks and provide empirical insights into this novel learning paradigm.

Yang Tian: Investigating Disability Representation and Bias in Text-to-Image Multimodal Models

Text-to-image (T2I) generative models have made remarkable progress in producing high-quality visual content from textual descriptions. However, these models exhibit biases in their representations of marginalized communities, particularly people with disabilities. Existing models frequently misrepresent disability by reinforcing stereotypes—such as defaulting to wheelchair users as the primary depiction—while underrepresenting diverse and nuanced disabilities, including cognitive and invisible conditions.
To examine these biases, I analyze images generated by Stable Diffusion XL and DALL·E 3 using a structured prompt design. The first analysis quantifies bias by comparing the similarity between images generated from general prompts (e.g., “photo of a person with a disability”) and specific prompts referring to different disability types. The second analysis explores how bias mitigation strategies shape disability portrayals, particularly regarding sentiment and emotional framing.
By critically assessing these biases and the effectiveness of bias mitigation techniques, this presentation underscores the need for ongoing assessment of generative models to ensure fair and inclusive disability representations.
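The abstract does not specify the similarity measure used to compare general-prompt and specific-prompt images; a common choice for such comparisons is cosine similarity between image-encoder embeddings (e.g. from CLIP). A minimal sketch with toy embedding vectors (illustrative values only):

```python
def cosine_similarity(u, v):
    # Cosine of the angle between two embedding vectors: 1.0 means
    # identical direction, 0.0 means orthogonal (unrelated) embeddings.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)

# Toy embeddings for an image from a general prompt
# ("photo of a person with a disability") and one from a specific prompt.
general = [0.2, 0.4, 0.4]
specific = [0.2, 0.5, 0.3]
sim = cosine_similarity(general, specific)
```

High similarity across many disability types would indicate that the model collapses diverse prompts onto one stereotyped depiction.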

1 Apr 2025

Sophia Conrad: Linguistic Variation Between Human-Written and Machine-Generated Text

The goal of this work-in-progress study is to perform a thorough linguistic analysis of machine-generated text (MGT) to determine how closely it aligns with human-written text (HWT) across a broad range of registers. The main research question is thus: Can LLMs accurately replicate the distinct registers of human writing? To address this question, I am using a geometric multivariate analysis (GMA), which is an adaptation of Biber's multidimensional analysis framework, a widely used method to study register variation.

As a first step, a dataset of HWT and MGT is created. HWT samples are taken from the Corpus of Contemporary American English, a large and widely used corpus of one billion words with texts from eight different registers. Comparable MGT samples are generated using four popular LLMs (GPT-4o, GPT-4o-mini, Llama3.3, and Deepseek-r1) under two different conditions: (1) prompting the models with the first sentence of the HWT, along with its register and desired length, and (2) providing half of a HWT and instructing the models to complete it with a certain length restriction. This allows us to approach two secondary research questions: Can LLMs better replicate human-written registers when given a longer example? And are there differences between the LLMs?

The next step is the comparison of linguistic features in the two text types using GMA. To that end, the Multi-Feature Tagger of English (MFTE) is employed, which implements over 200 linguistic features. The study identifies key dimensions of linguistic variation and examines whether MGT exhibits statistically significant differences from HWT across these dimensions.
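As background on the multidimensional-analysis pipeline mentioned in the abstract: in Biber-style analyses, raw feature counts are typically normalized to rates per 1,000 words and standardized (z-scored) before dimensions are extracted. A minimal sketch of that preprocessing with toy counts (not the study's data):

```python
def per_1000(count, total_words):
    # Normalize a raw feature count to a rate per 1,000 words,
    # so texts of different lengths become comparable.
    return count * 1000.0 / total_words

def zscores(values):
    # Standardize rates so features on different scales contribute
    # equally to the subsequent dimension extraction.
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]

# Toy: raw counts of past-tense verbs and word totals in four texts.
rates = [per_1000(c, w) for c, w in [(42, 1000), (10, 500), (60, 2000), (90, 1500)]]
z = zscores(rates)
```

The standardized feature matrix is then fed to the dimension-extraction step (factor analysis in Biber's framework, or the geometric variant used here).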

Ahmet Uluslu: Authorship Analysis in the Era of Large Language Models

15 Apr 2025

Ghassen Karray: TBA

Deborah Jakobi: TBA

29 Apr 2025

Janis Goldzycher: TBA

Teodora Vukovic: TBA

13 May 2025

Kirill Semenov: TBA

Zifan Jiang: TBA

27 May 2025

Yingqiang Gao: TBA

Lena Bolliger: TBA