ONJava.com -- The Independent Source for Enterprise Java

The Java Speech API, Part 1

The items of the speech menu let the user invoke the speech capabilities: speaking the contents of the text editor, pausing and resuming speech synthesis, and canceling a speak operation in progress. They are created by the getPlayMenuItem(), getPauseMenuItem(), getResumeMenuItem(), and getStopMenuItem() methods. The action listeners for these menu items contain the JSAPI-specific code for the speech synthesis actions.

Let's look at the action listener for each item of the speech menu. The action listener for the “Play” menu item calls the speakPlainText() method of the Synthesizer object, which invokes the speech synthesis capability provided by the FreeTTS speech synthesis engine. If some text is selected in the editor, only the selection is spoken; otherwise, the entire contents of the text area are read out.

private JMenuItem getPlayMenuItem()
{
    if (playMenuItem == null)
    {
        playMenuItem = new JMenuItem("Play");
        myActionListener =
            new ActionListener()
            {
                public void actionPerformed(ActionEvent ae)
                {
                    String textToPlay = "";

                    try
                    {
                        // retrieve the text to be played
                        if (textArea.getSelectedText() != null)
                            textToPlay = textArea.getSelectedText();
                        else
                            textToPlay = textArea.getText();

                        // play the text
                        synthesizer.speakPlainText(textToPlay, null);

                        // wait till speaking is done (this blocks the calling thread)
                        synthesizer.waitEngineState(Synthesizer.QUEUE_EMPTY);

                        System.out.println("NOVUSWARE : "
                                         + "Play menu item action performed.");
                    }
                    catch(Exception e)
                    {
                        e.printStackTrace();
                        System.out.println("NOVUSWARE : ERROR! "
                                         + "Play menu item action." + e);
                    }
                }
            };
        playMenuItem.setMnemonic('p');
        playMenuItem.addActionListener(myActionListener);

        System.out.println("NOVUSWARE : Play menu item created.");
    }
    return playMenuItem;
}
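
One point to keep in mind about the listener above: waitEngineState() blocks the calling thread until the queue is empty, and Swing action listeners run on the event dispatch thread, so the menus may not respond until the speech has finished. A possible variation, which is not part of the original VoicePad source, is to hand the speak-and-wait step to a worker thread. The sketch below shows only the threading pattern; SpeechTask, BackgroundSpeak, and speakInBackground() are hypothetical names, and SpeechTask stands in for the two synthesizer calls so that the example runs without JSAPI installed.

```java
// Sketch only: runs a blocking "speak and wait" step on a worker thread
// so that the caller (for example, a Swing action listener) returns at once.
final class BackgroundSpeak {

    // Hypothetical stand-in for the blocking speak-and-wait step.
    interface SpeechTask {
        void speakAndWait(String text) throws Exception;
    }

    // Starts the task on a new daemon thread and returns immediately.
    static Thread speakInBackground(final SpeechTask task, final String text) {
        Thread worker = new Thread(new Runnable() {
            public void run() {
                try {
                    task.speakAndWait(text);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }, "voicepad-speech");
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    public static void main(String[] args) throws InterruptedException {
        // Fake task that simulates the time spent speaking.
        SpeechTask fake = new SpeechTask() {
            public void speakAndWait(String text) throws InterruptedException {
                Thread.sleep(200);
                System.out.println("finished speaking: " + text);
            }
        };
        Thread worker = speakInBackground(fake, "hello");
        System.out.println("caller continues while speech is in progress");
        worker.join(); // only so that the demo waits before exiting
    }
}
```

In VoicePad, the body of speakAndWait() would presumably contain the speakPlainText() and waitEngineState() calls shown in the listener, leaving the event dispatch thread free to handle the “Pause”, “Resume”, and “Stop” menu items.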

The action listener for the “Pause” menu item calls the pause() method of the Synthesizer object, inherited from the Engine interface. This makes the FreeTTS speech synthesis engine pause its speech output.

private JMenuItem getPauseMenuItem()
{
    if (pauseMenuItem == null)
    {
        pauseMenuItem = new JMenuItem("Pause");
        myActionListener =
            new ActionListener()
            {
                public void actionPerformed(ActionEvent ae)
                {
                    try
                    {
                        // pause the speech synthesizer
                        synthesizer.pause();
                        System.out.println("NOVUSWARE : "
                                         + "Pause menu item action performed.");
                    }
                    catch(Exception e)
                    {
                        e.printStackTrace();
                        System.out.println("NOVUSWARE : ERROR! "
                                         + "Pause menu item action." + e);
                    }
                }
            };
        pauseMenuItem.setMnemonic('a');
        pauseMenuItem.addActionListener(myActionListener);

        System.out.println("NOVUSWARE : Pause menu item created.");
    }
    return pauseMenuItem;
}

The action listener for the “Resume” menu item calls the resume() method of the Synthesizer object, inherited from the Engine interface. This makes the FreeTTS speech synthesis engine continue speaking from the position in the text at which it was paused.

private JMenuItem getResumeMenuItem()
{
    if (resumeMenuItem == null)
    {
        resumeMenuItem = new JMenuItem("Resume");
        myActionListener =
            new ActionListener()
            {
                public void actionPerformed(ActionEvent ae)
                {
                    try
                    {
                        // resume the speech synthesizer
                        synthesizer.resume();

                        // log success only when resume() did not throw
                        System.out.println("NOVUSWARE : "
                                         + "Resume menu item action performed.");
                    }
                    catch(Exception e)
                    {
                        e.printStackTrace();
                        System.out.println("NOVUSWARE : ERROR! "
                                         + "Resume menu item action." + e);
                    }
                }
            };
        resumeMenuItem.setMnemonic('r');
        resumeMenuItem.addActionListener(myActionListener);

        System.out.println("NOVUSWARE : Resume menu item created.");
    }
    return resumeMenuItem;
}

The action listener for the “Stop” menu item contains the call to the cancel() method of the Synthesizer object. This halts any speech synthesis processing currently underway by the FreeTTS speech synthesis engine.

private JMenuItem getStopMenuItem()
{
    if (stopMenuItem == null)
    {
        stopMenuItem = new JMenuItem("Stop");
        myActionListener =
            new ActionListener()
            {
                public void actionPerformed(ActionEvent ae)
                {
                    try
                    {
                        // cancel the item at the top of the speech output queue
                        synthesizer.cancel();

                        System.out.println("NOVUSWARE : "
                                         + "Stop menu item action performed.");
                    }
                    catch(Exception e)
                    {
                        e.printStackTrace();
                        System.out.println("NOVUSWARE : ERROR! "
                                         + "Stop menu item action." + e);
                    }
                }
            };
        stopMenuItem.setMnemonic('t');
        stopMenuItem.addActionListener(myActionListener);

        System.out.println("NOVUSWARE : Stop menu item created.");
    }
    return stopMenuItem;
}

Finally, the VoicePad class contains the main() method, which instantiates a VoicePad object and calls its setVisible(true) method to display the text editor application.

// execute voicepad application
public static void main(String argv[])
{
    VoicePad voicePad = new VoicePad();
    voicePad.setVisible(true);
}

The entire source code of the VoicePad application can be viewed here.

Compiling and Running the VoicePad Application

We need to have a few software components installed and properly configured before we compile and execute the VoicePad application.

FreeTTS Speech Synthesis System

We will use the FreeTTS 1.1.2 JSAPI-compliant speech synthesis engine for our VoicePad application. The FreeTTS engine can be downloaded from FreeTTS on SourceForge.net.

Follow the instructions to install and configure the speech synthesis engine. Verify that the sample “HelloWorld” program runs with the FreeTTS engine before proceeding to compile the VoicePad application.

If you encounter any problems (normally related to either CLASSPATH or FreeTTS-speech-engine configuration), please refer to the FreeTTS manual and troubleshooting guide. Once the test applications work fine with the FreeTTS speech engine, you can proceed to compile and run the VoicePad application.
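
When tracking down CLASSPATH problems, it can help to check directly whether the JVM can load the classes that VoicePad depends on. The following is a minimal sketch of such a check, not part of the VoicePad source; ClasspathCheck and isPresent() are hypothetical names. The first two class names come from the JSAPI packages, and the third is the FreeTTS JSAPI engine class, which is assumed to be present in the FreeTTS release you installed.

```java
// Sketch only: reports whether the classes VoicePad needs can be loaded.
final class ClasspathCheck {

    // Returns true if the named class can be found on the current classpath.
    static boolean isPresent(String className) {
        try {
            // 'false' means: locate the class without running its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // the class file was found, but something it depends on is missing
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "javax.speech.Central",                              // JSAPI entry point
            "javax.speech.synthesis.Synthesizer",                // JSAPI synthesizer interface
            "com.sun.speech.freetts.jsapi.FreeTTSEngineCentral"  // FreeTTS JSAPI binding (assumed)
        };
        System.out.println("java.version = " + System.getProperty("java.version"));
        for (int i = 0; i < required.length; i++) {
            String status = isPresent(required[i]) ? "found" : "MISSING";
            System.out.println(required[i] + " : " + status);
        }
    }
}
```

Run it from the same console session in which the environment was set up; any class reported as MISSING suggests that the corresponding JAR file is not on the CLASSPATH.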

If you wish to use another speech synthesis engine, install and configure it and make sure that it runs properly before you use it in the VoicePad application. You will also need to change the environment-setting script so that the library files of that engine are used.

JDK 1.4

Download and install JDK 1.4. JDK 1.4 is preferred because of advantages such as improved IO. The FreeTTS speech synthesis system also specifies the use of JDK 1.4.

Edit the setEnv.bat script to define your environment-specific JAVA_HOME and SPEECH_SYNTHESIS_HOME directories. Executing this script on the console will set up the PATH, CLASSPATH, and other required environment settings. Run the command:

$ java -version

to verify that JDK 1.4 is being used.

Compile the VoicePad application with the JDK 1.4 compiler, using the command:

$ javac -d . VoicePad.java

If you encounter any PATH or CLASSPATH problems, check the setEnv.bat script for incorrect settings. After the source code compiles without any problems, run the VoicePad text editor using the command:

$ java com.novusware.speech.example.VoicePad

Figure 6. The VoicePad application screen

You should see the application screen come up, as shown in Figure 6. Now we can verify that both text editing and speech capabilities work properly.

First, try the options of the File menu. Open an existing text file, or create a new file and save the changes to it. Then take a look at the console from which the VoicePad application was started. It shows a trace of the methods that were executed, which gives an idea of the processing sequence of the VoicePad application.

Now you are ready to hear the application speak to you. Open a file in the text editor and select the “Play” option from the “Speech” menu. The VoicePad application will process the contents of the file opened in the editor and speak them using the FreeTTS speech synthesis engine. Next, select the “Pause” and “Resume” menu options to pause and resume the speech output.

This concludes our initial exploration of the Java Speech API for speech synthesis.

Summary

The Java Speech API provides a simple and elegant way to integrate speech capability in Java applications. The VoicePad speech-enabled text editor shows how easy it is to add speech synthesis to a Java application. In the next article, we will discuss the speech recognition support in the Java Speech API and the application areas for integrating speech capability with it. We will also cover the supporting specifications -- the Java Speech API Markup Language (JSML) and the Java Speech Grammar Format (JSGF).

Mandar S. Chitnis is a co-founder of Novusware Inc.

Lakshmi Ananthamurthy is a co-founder of Novusware Inc.
