Bulgarian voice assistant and voice commands dataset
Train, fine-tune, and test your voice assistant using Bulgarian voice data from thousands of speakers. With our datasets you can improve how well the voice assistant recognizes native and non-native speakers, and test that it works across different demographics and regions.
Access any of the following voice assistant activation data or voice commands for testing or training:
- Amazon Alexa dataset.
- Siri dataset.
- Google Assistant dataset.
- Cortana dataset.
The datasets consist of speech from thousands of Bulgarian speakers using voice assistants.
Regions and language
The Bulgarian (BG) data is recorded by both native and non-native speakers of the language.
The datasets contain audio clips of people recording themselves speaking voice assistant commands and wake words, with up to 10 minutes of speech per person. Wake words are the phrases that activate the voice assistant, while voice commands are requests for the voice assistant to perform a certain action.
Recordings vary in length; the average clip is 3 seconds long.
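The figures above (up to 10 minutes of speech per person, clips averaging 3 seconds) can be checked against a delivered batch of recordings. The following is a minimal Python sketch, not part of the dataset tooling: it assumes a hypothetical manifest of (speaker_id, duration_seconds) pairs, and the function name, speaker IDs, and durations are illustrative only.

```python
from collections import defaultdict

# Cap taken from the description: "up to 10 minutes of speech per person".
MAX_SECONDS_PER_SPEAKER = 10 * 60


def summarize(clips):
    """Summarize clip durations.

    clips: iterable of (speaker_id, duration_seconds) pairs.
    This manifest shape is an assumption, not the dataset's actual format.
    """
    totals = defaultdict(float)  # total seconds of speech per speaker
    clip_count = 0
    total_duration = 0.0

    for speaker_id, duration in clips:
        totals[speaker_id] += duration
        total_duration += duration
        clip_count += 1

    mean_clip = total_duration / clip_count if clip_count else 0.0
    # Speakers whose total speech exceeds the stated per-person cap.
    over_cap = sorted(s for s, t in totals.items() if t > MAX_SECONDS_PER_SPEAKER)

    return {
        "mean_clip_seconds": mean_clip,
        "speaker_totals": dict(totals),
        "speakers_over_cap": over_cap,
    }


# Illustrative values only, not real dataset entries.
example = [("spk_001", 2.5), ("spk_001", 3.5), ("spk_002", 3.0)]
summary = summarize(example)
print(summary["mean_clip_seconds"], summary["speakers_over_cap"])
```

In practice the durations would be read from the audio files or an accompanying metadata file, whose format is not specified here.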