Data Access


Update: SwissDial data set version 1.1 is now available, with an additional 7726 recorded GR sentences.

SwissDial is the first annotated parallel corpus of spoken Swiss German across 8 major dialects (AG, BE, BS, GR, LU, SG, VS, ZH). The data set includes around 3 hours of high-quality audio per dialect, together with Swiss German and High German transcripts.

More details about the data set can be found in the paper and on the project webpage:
SwissDial paper
Project Webpage

When using the SwissDial data set for research purposes, please cite:
SwissDial paper

Creative Commons License
SwissDial dataset by ETH Zürich is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.