Saturday, July 26, 2008

Speech Part 1 - How to Add "Text to Speech" (Speech Synthesis) to your Delphi Apps


How can I get my application to read text?


On Aug 11, 2001 Microsoft released the SAPI 5.1 SDK. This is significant because SAPI 5.1 is fully automation-enabled. That is, you can use it from any language that supports OLE automation. The SAPI objects are not ActiveX controls, and they can be either early or late bound.

In this article I’m going to show you how to get and install the SAPI 5.1 SDK. Then I’m going to show how to use the SDK to convert text to synthesized speech in a Delphi application. The synthesized speech is played over your computer’s speakers. I tested this in Delphi 5 and 6.

To get SAPI 5.1 you need to go to Microsoft’s Speech.net Technologies web site and follow the link to the download. Right next to the download link is the release notes link. READ THE RELEASE NOTES! Especially if your development machine is using a default language other than US English.

If you are running a beta version of the XP operating system you might have some problems. This is because SAPI 5.1 is built into XP, and the most recent public beta of XP as of this writing (RC 2) includes an earlier version of SAPI 5.1. Don’t try to install the release version of SAPI 5.1 into XP; it will not work.

Once you have read the release notes, follow the link to the Speech SDK 5.1 Download page. In most cases all you need to download is the link labeled “Speech SDK 5.1 (68 MB)”. This contains the SDK, the documentation and the free Microsoft English text to speech and speech recognition engines. The download is very large, 68 MB, so unless you have a high speed connection to the internet you might want to order the SDK CD from Microsoft.

…. Time passes while you download or wait for the postman ….

Ok, now you have the SAPI 5.1 SDK. Run the speechsdk51.exe to install it on your development system.

There is a bug in the type library import in Delphi 6; see the article "Delphi 6 - Imported Automation Events Bug". This sample will still work with the unit created by the type library import in Delphi 6, but only because none of the events for the component are used. If you want to use any of the SpVoice events you will need to read that article.

What you need to do now is make Delphi aware of the new SAPI automation objects. To do this, start up Delphi 5 or 6 (I didn’t try earlier versions) and go to Project | Import Type Library. In the Import Type Library dialog highlight “Microsoft Speech Object Library (Version 5.1)”. If you don’t find this in the list then something’s wrong with the installation of SAPI 5.1.

Delphi is going to want to put the SAPI components on your ActiveX palette page. I recommend you put these on a new palette page called “SAPI 5” since the number of components installed is large (19). You may also want to choose a “Unit dir name” of something other than the default. Make sure the “Generate Component Wrapper” check box is checked and press the Install button.

In the Install dialog choose the “Into new package” tab and in the “File name:” field give a package name like “SAPI5.dpk”. Press the browse button and make sure the dpk is created in the same directory where you created the components. Actually this isn’t completely necessary; it just helps keep things together. In the Install dialog’s Description field give some meaningful description like “SAPI 5 automation components”. Press OK.

Press yes in the confirm dialog and the new components will be created and installed.

If you now look in the directory you specified for the components you should find SpeechLib_TLB.pas (and a matching .dcr), which contains all the component code as well as interface, const, type and other useful information. This is your most valuable piece of documentation on the SDK. I’ve found it even better than the Microsoft SAPI 5.1 documentation, which is pretty good. This directory should also contain (if you followed the above instructions) SAPI5.dpk, which is your package source.

If you go to the far eastern end of your component palette you should find the new SAPI5 palette page with its 19 speech components.

Now for the fun part.

Let’s make an application that can synthesize speech. In Delphi start a new application and drop a button on the form. On the SAPI5 palette page find the SpVoice component and drop it on the form. On my machine this component is the 5th one reading from left to right.

Now create an OnClick event for your button that looks something like this:

procedure TForm1.Button1Click(Sender: TObject);
begin
  SpVoice1.Speak('Hello world!', SVSFDefault);
end;

Run the program and press the button. Cool, huh?
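
Since the SAPI objects can also be late bound, the same call can be made without the imported type library at all. The following is only a minimal sketch under a few assumptions: it is a console program rather than the form-based sample above, the program and procedure names are made up for illustration, and the constant is declared locally with the numeric value of SVSFDefault because no SAPI unit is in the uses clause.

```pascal
program SpeakLateBound;

{$APPTYPE CONSOLE}

uses
  ActiveX, ComObj;

const
  // Numeric value of SVSFDefault in the SAPI 5.1 SpeechVoiceSpeakFlags enum,
  // declared here because the imported SpeechLib_TLB unit is not used.
  SVSFDefault = 0;

procedure SayHello;
var
  Voice: OleVariant;
begin
  // 'SAPI.SpVoice' is the ProgID of the SAPI 5 voice automation object.
  Voice := CreateOleObject('SAPI.SpVoice');
  // Late-bound call: the method name is resolved at run time via IDispatch.
  Voice.Speak('Hello world!', SVSFDefault);
end; // Voice goes out of scope and is released here, before COM shuts down.

begin
  // A console program has to initialize COM itself.
  CoInitialize(nil);
  try
    SayHello;
  finally
    CoUninitialize;
  end;
end.
```

The trade-off is the usual one for late binding: nothing has to be imported or installed into the IDE, but the compiler cannot check method names or parameters, so a typo only shows up at run time.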

At this level it’s amazingly simple. The SpVoice object’s Speak method is very powerful, and that power comes from the second parameter. For the above example I chose the default mode, which causes the Speak method to return only when the synthesis is complete, not to purge pending speech requests, and to respond to special XML control tags embedded in the text.

The SDK’s documentation is contained in sapi.chm, which you will find in the \Program Files\Microsoft Speech SDK 5.1\Docs\Help directory.

Sapi.chm contains a lot of information. To go directly to the meat of the subject, open the last folder on the outline’s first level, titled Automation, go down to SpVoice and then to the Speak method. Read what’s there, and be sure to follow the link to the SpeechVoiceSpeakFlags info. You will find that, in addition to speaking the text passed in, Speak can do much more. Some of the more interesting flags are:

Pass in a file name and speak the text in the file. (SVSFIsFilename)
Make the function either return immediately (asynchronously) or only after the synthesis is complete (synchronously). If you speak asynchronously there are events available to fire when the speech is done. (SVSFlagsAsync)
Embed flags in the text that can control various aspects of the synthesis like pitch, rate, emphasis, and much more (see the included White Paper titled “XML TTS Tutorial”). I found this feature a bit addicting as I attempted to make the synthesized voice sing. (SVSFIsXML)
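
The flags can be combined with "or". Here is a rough sketch of an asynchronous call with embedded XML tags. It assumes the form from the sample above with SpVoice1 on it, a second button called Button2 (a name I made up for this example), and SpeechLib_TLB in the unit's uses clause so the SVSF constants are visible. The pitch and rate tags are from the XML TTS Tutorial; the attribute values are just ones I picked.

```pascal
// Assumes the sample form with SpVoice1 on it and a second button named
// Button2 (hypothetical); SpeechLib_TLB must be in the uses clause.
procedure TForm1.Button2Click(Sender: TObject);
begin
  // SVSFlagsAsync: Speak returns immediately instead of blocking the form.
  // SVSFIsXML: the text is parsed for SAPI TTS XML control tags.
  SpVoice1.Speak(
    'This is the normal voice. ' +
    '<pitch middle="10">This is higher.</pitch> ' +
    '<rate speed="-5">And this is slower.</rate>',
    SVSFlagsAsync or SVSFIsXML);
end;
```

Because the call is asynchronous, the form stays responsive while the voice is talking.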

One interesting thing I found (but did not see documented) is that you can speak a web site’s title by setting the flag to SVSFIsFilename and passing a URL. If you are connected to the internet, try replacing the Speak line in the sample with

SpVoice1.Speak('http://www.o2a.com', SVSFIsFilename);

And run it.

Even more bizarre, you can use the Speak method to play WAV files. Try

SpVoice1.Speak('C:\WINNT\MEDIA\Windows Logon Sound.wav', SVSFIsFilename);

There’s a lot more to SAPI than text to speech, and there’s more to text to speech than what I’ve covered here. Hopefully this will be the first of a number of articles on SAPI, but I’ll only do them if you’re interested, so please be sure to comment. Also, I’m completely open to suggestions on what you’d like to see next (if anything at all).

If you want to talk privately I’m at alecb@o2a.com.
