Update: SwissDial data set version 1.1 is now available, with 7,726 additional recorded GR sentences.
SwissDial is the first annotated parallel corpus of spoken Swiss German across 8 major dialects (AG, BE, BS, GR, LU, SG, VS, ZH). The data set includes around 3 hours of high-quality audio per dialect, together with Swiss German and High German transcripts.
More details about the data set can be found in the paper and on the project webpage:
SwissDial paper
Project Webpage
When using the SwissDial data set for research purposes, please cite:
SwissDial paper
The SwissDial data set by ETH Zürich is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.