Note: this file originally appeared on the Newton Underground site (http://www.newton-underground.com/dev/a0000003.shtml, later moved to http://resources.pdadash.com/newtund/NU/dev/). Since it no longer seems to be available, I have uploaded it here, and commented out or modified links as appropriate. Steve
Article 00003.1
How to work with the Text-to-Speech extension.
Contributed by William Nelson <will@newton-underground.com>
and Jake Bordens <jake@newton-underground.com>
Note: This article is presented for informational purposes only. We cannot provide you with the TTS extensions themselves, nor can we tell you where to find them. Hopefully, Apple/Newton will realize their potential and release them publicly.
Working with the Text-to-Speech extensions for OS 2.1 Newtons is not difficult - currently, the most difficult aspect of integrating TTS support into your applications is locating the extensions themselves, which are not publicly available. They are, however, fairly widely dispersed amongst the user community, and most users who really want them have found them.
The pre-beta version of TTS that has been circulating consists of two autoparts: Macintalk, which is the actual speech codec, and SpeakText:Newton, which installs as a transport and routes text to Macintalk for speaking with a given set of preferences for voice type, rate, pitch, and so on. However, SpeakText is unnecessary for TTS functionality; it's nothing more than a nice global hook for routing text to Macintalk. When incorporating TTS support into your applications, you'll want to send text to Macintalk directly, with the appropriate control codes.
With Macintalk installed, playSound(textstring) will produce spoken results. So for instance,
    playSound("Hello, this is Newton");
    someText := "12:00";
    playSound(someText);
    playSound("It is now " & someText & " o'clock");
will all result in spoken output. Note that raw Macintalk speaking of this sort is quite low in volume, so you'll want to increase the volume, preferably by using the delimited volume command (see Jim Bailey's article, More Text to Speech), or by bracketing any spoken text with a routine that raises system volume to the maximum and then restores it to the user preference, e.g.
    thevolume := GetVolume();   // save the user's volume setting
    SetVolume(4);               // raise the volume to maximum
    playSound(yourtext);
    SetVolume(thevolume);       // restore the user's setting
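The bracketing routine can be wrapped in a small function so that the volume is restored even if playSound throws an exception. This is only a minimal sketch, assuming Macintalk is installed; SpeakLoud is a hypothetical name chosen for illustration and is not part of the TTS extension.

```newtonscript
// SpeakLoud is a hypothetical helper name, not part of Macintalk.
SpeakLoud := func(text)
begin
    local oldVolume := GetVolume();   // remember the user's setting
    try
        SetVolume(4);                 // maximum system volume
        PlaySound(text)               // spoken only if Macintalk is installed
    onexception |evt.ex| do
        nil;                          // ignore the error so the volume is still restored
    SetVolume(oldVolume);             // back to the user's preference
end;
```

It could then be invoked with call SpeakLoud with ("Hello, this is Newton");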
The default voice type is "Fred", and the rate and pitch controls default to a middle range. However, you can easily produce speech in any of the 9 available voices, and in a great range of pitches and speaking rates. The basic method for control over these options is to embed in the text string specific control codes that Macintalk will parse and respond to appropriately. Any text between double brackets -- [[any text]] -- that is sent to Macintalk via playSound as part of a text string will not be spoken, but rather parsed by Macintalk for control codes. This is true even if the bracketed text is in an invalid format.
Jim Bailey's More Text to Speech has a full glossary of controls, so you'll want to consult that for an in-depth discussion. But as an example, [[svox xxxx]] will cause text to be spoken in voice xxxx, where xxxx is one of the following nine voices:
fred - the default; sort of like Kermit the Frog
So, for example,
    playSound("[[svox zarv]] Hello, this is Newton");
will speak that sentence in Zarvox's voice.
Putting it all together
Two things to note about the control codes are that 1) they may be placed anywhere within the text string to be spoken; and 2) multiple control codes are possible. So for instance, this text string will be spoken as intended:
    playSound("[[svox gnws]][[pbas -10]][[rate +200]] Hello, this is [[svox zarv]][[rate -500]] Newton");
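When voice, pitch, and rate come from user preferences, the control-code prefix can be assembled by a small helper instead of being typed by hand. The following is a rough sketch only: MakeSpeech and its parameter names are hypothetical, and pitch and rate are passed as strings so that a leading + or - is kept exactly as written.

```newtonscript
// MakeSpeech is a hypothetical helper; it only builds the string to be spoken.
// voice: four-letter voice code such as "zarv"
// pitch, rate: strings such as "-10" or "+200", copied verbatim into the codes
MakeSpeech := func(voice, pitch, rate, text)
begin
    "[[svox " & voice & "]][[pbas " & pitch & "]][[rate " & rate & "]] " & text
end;

// Usage: speak a sentence with the assembled control codes
playSound(call MakeSpeech with ("gnws", "-10", "+200", "Hello, this is Newton"));
```

Because the helper only concatenates strings, the same result can be passed to any routine that accepts text for Macintalk.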
Sample Routine:
The following is a sample NewtonScript snippet that I wrote for use in GestureLaunch to speak user-highlighted text in the voice of the user's choice (parm is a four-letter text string such as zarv or kath). It could easily be keyed to an on-screen button, with similar preferences set for rate and pitch.
    begin
        local hilitedText;
        local hiliteOffsets;
        local thevolume;
        local voxchoice;
        thevolume := GetVolume();             // Get current volume setting
        voxchoice := Clone(parm);             // Get the user's choice of voice type
        hiliteOffsets := GetHiliteOffsets();  // Get the hilited section
        // Nothing to speak if there is no hilited text
        if ClassOf(hiliteOffsets) <> 'array or Length(hiliteOffsets) = 0 then
            return;
        // Strip out the text from the first hilite
        hilitedText := SubStr(hiliteOffsets[0][0].text,
                              hiliteOffsets[0][1],
                              hiliteOffsets[0][2] - hiliteOffsets[0][1]);
        try
            SetVolume(4);                     // Set volume to max
            // Prepend the control code to the text string to be spoken
            playSound("[[svox " & voxchoice & "]] " & hilitedText)
        onexception |evt.ex| do
            nil;                              // Ignore errors so the volume is still restored
        SetVolume(thevolume);                 // Return volume to user setting
    end
Advanced topic: Multiple Sound Channels
The MP2x00 can create and play up to four sound channels simultaneously (and maybe more?). You can take advantage of this to have Macintalk speak the same or different text in different voices simultaneously -- e.g., to have up to four voices speaking at once.
    // Text of project C:\bottles\bottles2.ntk written on: 01/05/98 15:33:24
    // Beginning of text file definitions.txt
    DefConst('kSongText, "[[nmbr norm]][[pmod 1]]99[[pbas -8]] bottles of [[pbas +8]]beer[[pbas -8]] on the [[rate 0]][[pbas +8]]wall [[rate 100]] [[pbas +3]]99[[pbas -8]]bottles of [[rate 0]][[pbas +8]]beer[[pbas -5]] [[rate 200]] you take one [[rate 0]]down[[rate 200]] pass it around [[slnc 200]] [[pbas -8]]98[[pbas +3]] bottles of [[pbas +2]]beer on the [[pbas +2]]wall");
    DefConst('kCloseChannelsFunc,
        func(channel)
        begin
            print("Checking to see if we can close the channels");
            if (channel[2] = nil) then
                return;
            if (channel[2]:isActive() = nil) then
            begin
                //close the channels
                print("closing the channels");
                channel[0]:Close();
                channel[1]:Close();
                channel[2]:Close();
            end
            else
                //still speaking: check again later (channel[3] is this function)
                AddDeferredCall(channel[3], [channel]);
        end
    );
    // End of text file C:\rowing\definitions.txt
    // Beginning of file base.lyt
    rowingBaseView :=
        {viewBounds: {left: -3, top: 20, bottom: 254, right: 151},
         channel1: nil,
         channel2: nil,
         channel3: nil,
         debug: "rowingBaseView",
         _proto: @179
        };
    _view000 := /* child of rowingBaseView */
        {_proto: @166};
    _view001 := /* child of rowingBaseView */
        {
         buttonClickScript:
             func()
             begin
                 local spCh1, spCh2, spCh3;
                 //initialize the sound channels
                 print("Initializing the sound channels");
                 spCh1 := {_proto: @431};
                 spCh2 := {_proto: @431};
                 spCh3 := {_proto: @431};
                 //open the sound channels
                 print("Opening the sound channels");
                 spCh1:Open();
                 spCh2:Open();
                 spCh3:Open();
                 //schedule the text to speak
                 print("Scheduling the text to speak");
                 spCh1:Schedule("[[rset 0]][[svox ralf]][[pbas 60]][[slnc 10]]" & kSongText);
                 spCh2:Schedule("[[rset 0]][[svox zarv]][[pbas 45]]" & kSongText);
                 spCh3:Schedule("[[rset 0]][[svox prin]][[pbas 45]]" & kSongText);
                 //start the song
                 print("starting the song");
                 spCh1:start(true);
                 spCh2:start(true);
                 spCh3:start(true);
                 //add a deferred call to close the sound channels
                 print("registering the deferred call");
                 AddDeferredCall(kCloseChannelsFunc, [[spCh1, spCh2, spCh3, kCloseChannelsFunc]]);
             end,
         text: "Sing?",
         viewBounds: {left: 14, top: 194, right: 140, bottom: 210},
         _proto: @226
        };
    _view002 := /* child of rowingBaseView */
        {
         text: "2 Drunks & a Robot!! -- The following initializes three sound channels and plays a voice track in each. Be warned, use this at your own risk. Now... without further delay....",
         viewBounds: {left: 12, top: 12, right: 142, bottom: 182},
         viewJustify: 0,
         viewFont: ROM_fontSystem10,
         _proto: @218
        };
    constant |layout_base.lyt| := rowingBaseView;
    // End of file base.lyt