Are you developing a voice assistant or solutions on top of an existing service? Our datasets are perfect for voice assistants in Estonian. We have datasets available for wake words, skill commands, and voice commands.
Train, fine-tune, and test your voice assistant using voice data from thousands of Estonian speakers. With our datasets you can improve how well the voice assistant recognizes native and non-native speakers, and test that it works across different demographics and regions.
Access any of the following voice assistant activation data or voice commands for testing or training:
- Amazon Alexa dataset.
- Siri dataset.
- Google Assistant dataset.
- Cortana dataset.
The datasets consist of speech from thousands of Estonian speakers using voice assistants.
The Estonian data is recorded by native and non-native speakers in the following region: Estonia (EE).
The datasets contain audio clips of people recording themselves speaking voice assistant commands and wake words, with up to 10 minutes of speech per person. Wake words are the phrases that activate the voice assistant, while voice commands are requests for the voice assistant to perform a certain action.
Recordings vary in length, with clips averaging 3 seconds.