Latvian voice assistant and voice commands dataset
Train, fine-tune, and test your voice assistant using voice data from thousands of people speaking Latvian. With our datasets you can improve how well your voice assistant recognizes native and non-native speakers, test that it works across different demographics and regions, or both.
Access any of the following voice assistant activation data or voice commands for testing or training:
- Amazon Alexa dataset.
- Siri dataset.
- Google Assistant dataset.
- Cortana dataset.
The datasets consist of speech from thousands of Latvian speakers using voice assistants.
Regions and language
The Latvian data is recorded by native and non-native speakers of Latvian (LV).
The datasets contain audio clips of people recording themselves speaking voice assistant commands and wake words, with up to 10 minutes of speech per person. Wake words are the phrases that activate the voice assistant, while voice commands are requests for the voice assistant to perform a certain action.
Recordings vary in length; the average clip is about 3 seconds long.
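As a rough illustration of how the figures above (about 3 seconds per clip, up to 10 minutes per person) could be checked on a delivered batch, here is a minimal Python sketch. It assumes the clips are WAV files and that a speaker ID is available for each clip. Neither detail is stated in this description, and the function names are hypothetical.

```python
import wave
from collections import defaultdict
from statistics import mean

# From the description: up to 10 minutes of speech per person.
MAX_SECONDS_PER_SPEAKER = 10 * 60


def wav_duration(path):
    """Return the length of a WAV file in seconds (WAV format is an assumption)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def summarize(clips):
    """Summarize clips given as (speaker_id, duration_seconds) pairs.

    The pairing of clips with speaker IDs is hypothetical; adapt it to
    whatever metadata actually ships with the dataset.
    """
    per_speaker = defaultdict(float)
    durations = []
    for speaker_id, seconds in clips:
        per_speaker[speaker_id] += seconds
        durations.append(seconds)
    # Speakers whose total speech exceeds the stated per-person cap.
    over_limit = sorted(
        s for s, total in per_speaker.items() if total > MAX_SECONDS_PER_SPEAKER
    )
    return {
        "mean_clip_seconds": mean(durations) if durations else 0.0,
        "speaker_totals": dict(per_speaker),
        "over_limit": over_limit,
    }
```

In practice, each clip's duration would come from wav_duration and be passed to summarize along with its speaker ID. A mean far from 3 seconds or a non-empty over_limit list would suggest the batch differs from the description.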