Department of Computational Linguistics

Visualizing Document Embeddings

Summary

We want to explore methods for reducing high-dimensional data to two or three dimensions for visualization, including model distillation, autoencoders, and classical techniques such as PCA. We address the question: which technique is the "best"? We define "best" in terms of quality (how well the characteristics of the original data are preserved) and efficiency (computational cost), and aim to provide a comprehensive comparison of these dimensionality reduction methods for visualizing document embeddings.
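To make the two criteria concrete, here is a minimal sketch of how one classical baseline (PCA) could be scored on both. It is an illustration under stated assumptions, not the project's prescribed evaluation: the random matrix stands in for real document embeddings, k-nearest-neighbour preservation is only one possible proxy for "quality", wall-clock time is only one possible proxy for "efficiency", and the function names `pca_reduce`, `knn_indices`, and `neighbour_preservation` are hypothetical.

```python
import time
import numpy as np


def pca_reduce(X, n_components=2):
    """Project the rows of X onto the top principal components (via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def knn_indices(X, k):
    """Indices of the k nearest Euclidean neighbours of every row, excluding itself."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared pairwise distances
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    return np.argsort(d2, axis=1)[:, :k]


def neighbour_preservation(X_high, X_low, k=10):
    """Mean fraction of each point's k nearest neighbours kept after reduction."""
    nn_high = knn_indices(X_high, k)
    nn_low = knn_indices(X_low, k)
    overlap = [len(set(a) & set(b)) for a, b in zip(nn_high, nn_low)]
    return float(np.mean(overlap)) / k


# Assumption: random vectors as a placeholder for real document embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 64))

start = time.perf_counter()
coords = pca_reduce(embeddings, n_components=2)
elapsed = time.perf_counter() - start  # efficiency proxy: wall-clock seconds

score = neighbour_preservation(embeddings, coords, k=10)  # quality proxy in [0, 1]
print(f"shape={coords.shape}, kNN preservation={score:.3f}, time={elapsed:.4f}s")
```

The same two measurements could be applied unchanged to the output of an autoencoder or a distilled model, which is what would make the methods comparable on equal terms.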

If interested, please send an email addressed to all three of us for maximum visibility.

Requirements

  • Machine Learning
  • Python/PyTorch