Text Encoder Model for Romansh Idioms

Supervisors: Annette Rios, Jannis Vamvas

Summary

Train a BERT-like model on 130M tokens of Romansh text and evaluate your model on common NLP tasks.
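For orientation, BERT-like encoders are usually trained with masked language modeling: a fraction of the input tokens is hidden and the model learns to predict them. The sketch below shows the masking step from the original BERT recipe (select about 15% of positions; of those, 80% become a mask token, 10% a random token, 10% stay unchanged) in plain Python. The function name `mask_tokens`, the toy vocabulary, and the placeholder tokens are assumptions made for illustration, not part of the project description; in practice a library utility such as `DataCollatorForLanguageModeling` in HuggingFace Transformers would likely handle this step.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed mask symbol; the real one depends on the tokenizer


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (inputs, labels) for masked language modeling.

    labels[i] holds the original token at every selected position and
    None elsewhere, so a loss would only be computed on selected positions.
    """
    rng = random.Random(seed)
    inputs = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = MASK_TOKEN          # 80%: replace with the mask token
        elif roll < 0.9:
            inputs[i] = rng.choice(vocab)   # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return inputs, labels


# Hypothetical placeholder tokens; real input would be tokenized Romansh text.
toy_vocab = ["tok_a", "tok_b", "tok_c", "tok_d"]
toy_sentence = ["tok_a", "tok_c", "tok_b", "tok_d", "tok_a", "tok_b"]
masked_inputs, masked_labels = mask_tokens(toy_sentence, toy_vocab, seed=0)
```

Positions where the label is `None` are left untouched in the input, which makes it easy to check that only the selected positions contribute to training.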

Your tasks:

  1. Fine-tune pretrained model(s) on Romansh data
  2. Evaluate on NLP tasks, e.g. text classification or named entity recognition (you may have to create a test set yourself)
  3. Present results at the SwissText 2027 conference
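As a rough illustration of the evaluation step, the sketch below computes macro-averaged F1 for a text classification test set in plain Python. The posting does not name a metric, so the choice of macro-F1 is an assumption, and the function name `macro_f1` and the label names are hypothetical placeholders.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("cannot score an empty test set")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p is a false positive
            fn[g] += 1  # gold label g was missed
    labels = sorted(set(gold) | set(pred))
    # F1 = 2*TP / (2*TP + FP + FN); the denominator is non-zero for every
    # label that occurs at least once in gold or pred.
    scores = [2 * tp[l] / (2 * tp[l] + fp[l] + fn[l]) for l in labels]
    return sum(scores) / len(scores)


# Hypothetical labels and predictions, for illustration only.
gold_labels = ["news", "news", "sport", "sport"]
pred_labels = ["news", "sport", "sport", "sport"]
score = macro_f1(gold_labels, pred_labels)
```

Macro-averaging gives every class the same weight, which can matter when a small, manually created test set has uneven class sizes.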

Expected outcome:

A BERT-like model for Romansh idioms

You will deepen the following skills:

HuggingFace Transformers

Python/PyTorch