Text Encoder Model for Romansh Idioms

Supervisors: Annette Rios, Jannis Vamvas

Summary

Train a BERT-like encoder on 130M tokens of Romansh text, starting from one or more pretrained models, and evaluate it on common NLP tasks.

Your tasks:

Fine-tune pretrained model(s) on Romansh data
Evaluate the model on NLP tasks such as text classification or named entity recognition (you may need to create a test set)
Present results at the SwissText 2027 conference
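
To give a feel for the first task, here is a minimal sketch of the masked-token corruption that BERT-style training commonly relies on. It is an illustration under stated assumptions, not the project's prescribed setup: the function name `mask_tokens`, the 15% masking rate, the 80/10/10 split, and the token ids in the example are all hypothetical choices. The label value -100 is used because PyTorch's cross-entropy loss ignores it by default. In practice a library data collator would likely handle this step.

```python
import random


def mask_tokens(token_ids, vocab_size, mask_id, special_ids=frozenset(),
                mask_prob=0.15, rng=None):
    """Return (inputs, labels) for masked language modeling.

    Assumed scheme: each non-special token is selected with probability
    mask_prob; a selected token becomes mask_id 80% of the time, a random
    token 10% of the time, and stays unchanged 10% of the time.
    Unselected positions get the label -100 so the loss can skip them.
    """
    rng = rng or random.Random()
    inputs = list(token_ids)
    labels = [-100] * len(inputs)
    for i, tok in enumerate(token_ids):
        # Never corrupt special tokens, and skip unselected positions.
        if tok in special_ids or rng.random() >= mask_prob:
            continue
        labels[i] = tok  # the model must predict the original token here
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = mask_id
        elif roll < 0.9:
            inputs[i] = rng.randrange(vocab_size)
        # otherwise the input token is left as it is
    return inputs, labels


if __name__ == "__main__":
    # Hypothetical ids: 101 and 102 stand in for special tokens, 103 for the mask.
    ids = [101, 7, 8, 9, 10, 11, 12, 102]
    inputs, labels = mask_tokens(ids, vocab_size=1000, mask_id=103,
                                 special_ids={101, 102}, rng=random.Random(0))
    print(inputs)
    print(labels)
```

Passing a seeded `random.Random` makes the corruption reproducible, which helps when comparing runs across pretrained starting points.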

Expected outcome:

A BERT-like model for Romansh idioms

You will deepen the following skills:

HuggingFace Transformers
Python/PyTorch