Text Encoder Model for Romansh Idioms
Supervisors: Annette Rios, Jannis Vamvas
Summary
Train a BERT-like model on 130M tokens of Romansh text and evaluate your model on common NLP tasks.
Your tasks:
- Fine-tune pretrained model(s) on Romansh data
- Evaluate the model on NLP tasks, e.g. text classification or named entity recognition (you may have to create a test set yourself)
- Present results at the SwissText 2027 conference
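
A note on the training objective: BERT-like models are typically trained with masked language modelling, although this description does not fix the objective, so treat that as an assumption. The sketch below shows the usual token corruption scheme in plain Python: about 15% of positions are selected, and of those 80% become a mask token, 10% a random token, and 10% stay unchanged. The names `mask_tokens`, `MASK`, and the toy vocabulary are hypothetical and only serve the illustration. In HuggingFace Transformers this step is normally handled by `DataCollatorForLanguageModeling` with its `mlm_probability` parameter.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol; real tokenizers define their own


def mask_tokens(tokens, vocab, mlm_probability=0.15, rng=None):
    """Corrupt a token sequence for masked language modelling.

    Returns (corrupted, labels). labels[i] is the original token at
    positions selected for prediction and None everywhere else.
    """
    rng = rng or random.Random()
    corrupted, labels = [], []
    for token in tokens:
        # Most positions are left alone and contribute no loss.
        if rng.random() >= mlm_probability:
            corrupted.append(token)
            labels.append(None)
            continue
        # Selected position: the model must predict the original token.
        labels.append(token)
        roll = rng.random()
        if roll < 0.8:
            corrupted.append(MASK)               # 80%: replace with mask
        elif roll < 0.9:
            corrupted.append(rng.choice(vocab))  # 10%: random token
        else:
            corrupted.append(token)              # 10%: keep unchanged
    return corrupted, labels


if __name__ == "__main__":
    # Toy placeholder tokens, not real Romansh data.
    vocab = [f"tok{i}" for i in range(50)]
    sentence = [random.choice(vocab) for _ in range(20)]
    corrupted, labels = mask_tokens(sentence, vocab, rng=random.Random(0))
    print(corrupted)
    print(labels)
```

Because the corruption is re-sampled each time a sequence is drawn, the model sees different masked positions across epochs, which matters when the corpus is as small as 130M tokens.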
Expected outcome:
A BERT-like model for Romansh idioms
You will deepen the following skills:
- HuggingFace Transformers
- Python/PyTorch