Datasets for your speech recognition solution in Swedish
Improve your Swedish automatic speech recognition models or deploy new models in days using our speech and voice recognition datasets. The Swedish datasets you can choose from are scripted and unscripted recordings with one or two people speaking. Tell us what data you need and we will include only the data that fits your use case, whether that means specific background noise levels, speakers from certain regions, or speakers of a specific age group, gender, or nativity.
We can provide you with thousands of hours of speech recorded by tens of thousands of unique speakers. With our high-quality training datasets, you can gain a competitive advantage, reduce time to market, and lower the word error rate of your models.
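Word error rate (WER) is the number of word substitutions, deletions, and insertions needed to turn a model's output into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that calculation, assuming plain whitespace tokenization and no text normalization; the function name `word_error_rate` and the Swedish example sentences are illustrative and not part of the datasets.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative transcripts (made up for this example).
reference = "jag heter anna och bor här"
hypothesis = "jag heter anna bor där"
print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
```

In practice, both transcripts are usually normalized (casing, punctuation, numerals) before scoring, since those choices change the result.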
Our speech recognition datasets in Swedish include native and non-native speakers from the following regions:
Swedish language: native speakers from Sweden (SE) and Finland (FI), and non-native speakers.
Speech recognition data specifications
The Swedish datasets contain transcribed and segmented audio clips of people talking about various topics or reading sentences, with up to two hours of speech per person. The speech is captured using mobile phones and laptops from a diverse crowd of speakers representing all ages and backgrounds. This makes the datasets well suited to ASR and voice assistant use cases on mobile devices.
Recordings vary in length depending on the type of recording. Scripted speech recordings are up to 30 seconds long, while two-person conversations can be up to one hour long. The recordings are transcribed and segmented by speaker, noise, music, and overlapping speech.
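Because the recordings are segmented by speaker, noise, music, and overlapping speech, a training pipeline can select only the clean, single-speaker portions. The sketch below shows one way this might look. The `Segment` structure, its field names, and the label values are hypothetical and chosen for illustration; the actual delivery format is not described here and may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Segment:
    # Hypothetical schema: field names and label values are assumptions.
    start: float                   # segment start, in seconds
    end: float                     # segment end, in seconds
    label: str                     # "speech", "noise", "music", or "overlap"
    speaker: Optional[str] = None  # speaker ID for speech segments
    text: str = ""                 # transcript for speech segments


def clean_speech(segments: List[Segment]) -> List[Segment]:
    """Keep single-speaker speech segments that have a transcript."""
    return [s for s in segments if s.label == "speech" and s.text.strip()]


def seconds_per_speaker(segments: List[Segment]) -> Dict[str, float]:
    """Total clean speech duration for each speaker ID."""
    totals: Dict[str, float] = {}
    for s in clean_speech(segments):
        key = s.speaker or "unknown"
        totals[key] = totals.get(key, 0.0) + (s.end - s.start)
    return totals
```

Checking the per-speaker totals is a quick way to confirm that a selection is balanced across speakers before training.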
Automatic speech recognition (ASR) is also known as speech-to-text and voice recognition.