This is a tiny robot that uses voice recognition on the Raspberry Pi to talk with people. By the end of this project, we should have a working robot that understands and answers spoken questions.
This is achieved using free APIs available from different sources. The robot converts a spoken question into text, processes the query and retrieves the answer, and finally turns the answer from text into speech. The project can be divided into four parts:
- Convert the speech of the user to text
- Send the text as a query to a server
- Collect the answer from the server and process the data
- Convert the resulting text to speech and send it to the audio output
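The four parts above can be sketched as one small function that chains them together. This is only an outline under assumptions: the function name `answer_question` and its four parameters are hypothetical placeholders, and each stage is passed in as a plain callable so that any service can be plugged in later.

```python
def answer_question(audio, speech_to_text, ask_server, extract_answer, text_to_speech):
    """Run one question through the four stages and return the spoken result.

    All four callables are hypothetical placeholders for the real services.
    """
    question = speech_to_text(audio)      # 1. speech -> text
    raw_reply = ask_server(question)      # 2. text -> query sent to a server
    answer = extract_answer(raw_reply)    # 3. pick the answer out of the reply
    return text_to_speech(answer)         # 4. text -> audio for the speaker
```

Keeping the stages separate like this makes it easy to test each one on its own, or to swap one service for another without touching the rest.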
Speech to Text
Speech recognition can be achieved in many ways on Linux (and therefore on the Raspberry Pi). The easiest way is to use Google's voice recognition API. Its accuracy is very good, even when the user has a strong accent.
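As a rough illustration, here is a minimal sketch of sending a recorded question to Google for recognition. It assumes the third-party `SpeechRecognition` package (imported as `speech_recognition`) is installed and that the question has already been recorded to a WAV file; the function name `transcribe_wav` and the file name in the usage note are made up for this example.

```python
def transcribe_wav(path, language="en-US"):
    """Return the recognised text for a WAV file, or None if nothing was understood."""
    # Assumption: the SpeechRecognition package is installed.
    # It is imported here so the rest of the program still loads without it.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into memory

    try:
        # Sends the audio to Google's web speech service and returns the transcript.
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        # The service could not make sense of the audio.
        return None
```

Calling `transcribe_wav("question.wav")` should return the question as a string. Network or quota problems are raised as `sr.RequestError`, which this sketch deliberately lets through so the caller can decide what to do.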
Query Processing
Processing the query is just like “Googling” a question, except that we want exactly one answer to come back for each question rather than a list of results. Wolfram Alpha is a good choice here.
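To make the single-answer idea concrete, here is a minimal sketch that asks Wolfram Alpha a question over HTTP. It assumes the Wolfram Alpha Short Answers API, which replies with one line of plain text, and an AppID obtained from the Wolfram Alpha developer portal. The helper names `build_query_url` and `ask_wolfram` are made up for this example, and other Wolfram Alpha APIs would work just as well.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Short Answers API endpoint, which returns plain text.
SHORT_ANSWERS_URL = "https://api.wolframalpha.com/v1/result"


def build_query_url(question, app_id):
    """Build the request URL, escaping the question for use in a query string."""
    return SHORT_ANSWERS_URL + "?" + urlencode({"appid": app_id, "i": question})


def ask_wolfram(question, app_id, timeout=10):
    """Send the question and return the one-line answer as text."""
    with urlopen(build_query_url(question, app_id), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With a valid AppID, `ask_wolfram("What is the capital of France?", app_id)` returns a short string that can be handed straight to the text-to-speech step. If the service cannot answer, it responds with an HTTP error, which `urlopen` raises as an exception.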
Text to Speech
The processed query returns an answer in text format. What we need to do now is turn that text into audio speech. There are a few options available, but a good choice is Google’s speech service because of its excellent quality.
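One possible way to do this is sketched below. It assumes the third-party `gTTS` package, an unofficial wrapper around Google Translate's text-to-speech, which may or may not be the exact service used for this robot. The function name `speak_to_file` and the default file name are made up for this example.

```python
def speak_to_file(text, path="answer.mp3", lang="en"):
    """Turn the answer text into an MP3 file and return the file path."""
    if not text or not text.strip():
        raise ValueError("there is no answer text to speak")

    # Assumption: the gTTS package is installed.
    # It is imported here so the rest of the program still loads without it.
    from gtts import gTTS

    # Requests the speech audio from Google and writes it to disk as MP3.
    gTTS(text=text.strip(), lang=lang).save(path)
    return path
```

The resulting MP3 still has to be played through the audio output of the Pi, for example with a command-line player such as `mpg123` (an assumption, any MP3 player will do).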
Block diagram of Piwi