Text-to-speech (TTS) engines are widely used in assistive technology, communication systems, and automation in various industries. They’re essential for individuals with visual impairments or reading disabilities, enabling them to access written content audibly.
TTS also assists students with dyslexia, language learners, and individuals with learning difficulties. Virtual assistants like Siri, Alexa, and Google Assistant use it to respond to user requests, while GPS devices and apps use it for spoken navigation. Additionally, TTS is used in audiobooks, podcasts, and video games for voiceovers and narration.
In this project, we’ll build a TTS engine using an ESP32. The user enters a number on a webpage hosted by the ESP32, and the ESP32 then reads that number aloud through a connected speaker.
Components
- ESP32 x1
- TDA2030A audio amplifier module x1
- 4Ω/8Ω speaker x1
- Connecting or DuPont wires
Circuit connections
To build this project, connect:
- ESP32’s GPIO25 pin to the amplifier module’s Audio In + signal
- ESP32’s ground (GND) pin to the amplifier module’s Audio In – signal
- The amplifier module’s VCC and GND pins to ESP32’s 3V3 and GND pins, respectively
Next, attach a 4Ω/8Ω speaker to the amplifier module’s speaker output connector. Finally, connect ESP32 to your computer with a Micro-USB cable. The same cable is used to upload the sketch and to power the board.
The circuit diagram for the device is as follows…
Arduino sketch
Now, connect ESP32 to your computer. Write and upload the following sketch using Arduino IDE.
How it works
ESP32 is programmed as a web server that hosts a webpage where users can input a number. It receives the number over the WiFi network and converts it to speech using the Talkie library. The generated speech is then played through the speaker connected to ESP32 via the amplifier module.
The code
For this project, you’ll need the Talkie library. Install the latest version of the Talkie library from the Library Manager in the Arduino IDE.
To summarize, the code creates a simple web server that performs text-to-speech (TTS) for numbers. It uses an ESP32 microcontroller, the Talkie library for speech synthesis, and a web interface for input.
The first section of the sketch includes the libraries required for WiFi, the web server, and Talkie. It also declares variables for the WiFi credentials, the web server object, and the htmlResponse buffer.
It also defines the voice object of the Talkie class, which will be used to play the speech samples. The spZERO to spTHOUSAND, and other sp… arrays hold the audio data for each word, stored in program memory (flash) using the PROGMEM keyword to conserve valuable RAM.
The user-defined handleRoot() function serves the HTML page to the client. The HTML includes a text input field (msg) for the user to enter a number and a “Speak” button. The page also embeds JavaScript that uses jQuery to handle the button click.
When the button is clicked, it retrieves the number from the input field and sends a GET request to the /speak endpoint on ESP32’s web server.
The user-defined textToSpeech() function is called when the server receives a request at the /speak endpoint. It retrieves the number from the request’s query parameter (Number), converts it to an integer, and then calls the speakOutNumber() function to handle the actual speech synthesis.
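The conversion step is worth a closer look, because the value arrives as text typed by the user. If a sketch relies on Arduino’s String::toInt(), non-numeric input silently becomes 0, so “abc” would be spoken as “zero”. The following is a minimal, host-side sketch of a stricter conversion. The helper name parseNumberParam and the idea of rejecting bad input are assumptions for illustration and are not taken from the original sketch. The ±999,999 limit mirrors the range that speakOutNumber() is described as supporting.

```cpp
#include <cerrno>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper (not part of the original sketch): convert the raw
// "Number" query value to an integer, rejecting anything that cannot be spoken.
// Returns std::nullopt for empty, non-numeric, or out-of-range input.
std::optional<long> parseNumberParam(const std::string& raw) {
    if (raw.empty()) return std::nullopt;

    errno = 0;
    char* end = nullptr;
    long value = std::strtol(raw.c_str(), &end, 10);

    if (errno == ERANGE) return std::nullopt;               // overflowed long
    if (end == raw.c_str() || *end != '\0') return std::nullopt;  // no digits, or trailing junk
    if (value < -999999 || value > 999999) return std::nullopt;   // outside the supported range

    return value;
}
```

On the ESP32, the handler could pass the query value through a check like this and return an HTTP error instead of speaking when the result is empty.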
The user-defined speakOutNumber() function contains the core logic of the TTS engine. It recursively breaks the number down into thousands, hundreds, tens, and units, and uses the Talkie library’s say() function to play the corresponding speech samples. It handles negative numbers, zero, and values up to 999,999.
The logic is somewhat complex because English number names are irregular (e.g., “eleven” instead of “ten one”). The function uses switch statements and if conditions to handle these special cases.
The setup() function initializes ESP32. It sets up serial communication, connects to the WiFi network, and starts the web server. It also sets GPIO25 high initially. GPIO25 is one of ESP32’s DAC-capable pins and is the pin wired to the amplifier input, so this step is presumably part of preparing the audio output for the Talkie library.
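To make the decomposition concrete, here is a minimal sketch that runs on a computer rather than on ESP32. Instead of calling say(), it returns the list of words in the order they would be spoken. The function names, the word tables, and the choice of “minus” for negative numbers are assumptions for illustration. This version also uses a small helper rather than recursion, so it is not the original speakOutNumber() implementation.

```cpp
#include <string>
#include <vector>

// Illustrative word tables. On the ESP32, each entry would instead refer to
// one of the sp... sample arrays passed to Talkie's say().
static const char* const kUnits[] = {
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen"};
static const char* const kTens[] = {
    "", "", "twenty", "thirty", "forty", "fifty",
    "sixty", "seventy", "eighty", "ninety"};

// Appends the words for a value in 0..999 (appends nothing for 0).
static void appendBelowThousand(int n, std::vector<std::string>& out) {
    if (n >= 100) {
        out.push_back(kUnits[n / 100]);
        out.push_back("hundred");
        n %= 100;
    }
    if (n >= 20) {               // 20..99: tens word, then optional unit
        out.push_back(kTens[n / 10]);
        n %= 10;
    }
    if (n > 0) {                 // 1..19 each have their own word
        out.push_back(kUnits[n]);
    }
}

// Returns the spoken words for n, or an empty list if |n| > 999,999.
std::vector<std::string> numberToWords(long n) {
    std::vector<std::string> out;
    if (n < -999999 || n > 999999) return out;
    if (n == 0) {
        out.push_back("zero");
        return out;
    }
    if (n < 0) {
        out.push_back("minus");  // assumed wording for negative numbers
        n = -n;
    }
    if (n >= 1000) {
        appendBelowThousand(static_cast<int>(n / 1000), out);
        out.push_back("thousand");
        n %= 1000;
    }
    appendBelowThousand(static_cast<int>(n), out);
    return out;
}
```

For example, numberToWords(-42) yields “minus”, “forty”, “two”, and numberToWords(11) yields the single word “eleven” because values below 20 are looked up directly rather than split into tens and units.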
The loop() function continuously listens for incoming client requests and processes them using the web server object.