Note: this file originally appeared on the Newton Underground site (http://www.newton-underground.com/dev/a0000003.shtml, later moved to http://resources.pdadash.com/newtund/NU/dev/). Since it no longer seems to be available, I have uploaded it here, and commented out or modified links as appropriate. Steve
Article 00003.1
How to work with the Text-to-Speech extension.
Contributed by William Nelson <will@newton-underground.com>
and Jake Bordens <jake@newton-underground.com>
Note: This article is presented for informational purposes only. We cannot provide
you with the TTS extensions themselves, nor can we tell you where to find them.
Hopefully, Apple/Newton will realize their potential and release them publicly.
Working with the Text-to-Speech extensions for OS 2.1 Newtons is not difficult - currently, the most difficult aspect of integrating TTS support into your applications is locating the extensions themselves, which are not publicly available. They are, however, fairly widely dispersed amongst the user community, and most users who really want them have found them.
The pre-beta version of TTS that has been circulating consists of two autoparts: Macintalk, which is the actual speech codec, and SpeakText:Newton, which installs as a transport and routes text to Macintalk for speaking with a given set of preferences for voice type, rate, pitch, and so on. However, SpeakText is unnecessary for TTS functionality; it's nothing more than a nice global hook for routing text to Macintalk. When incorporating TTS support into your applications, you'll want to send text to Macintalk directly, with the appropriate control codes.
With Macintalk installed, playSound(textstring) will produce spoken results. So for instance,
playSound("Hello, this is Newton");
someText:="12:00";
playSound(someText);
playSound("It is now " & someText & " o'clock");
will all result in spoken output. Note that raw Macintalk speech of this sort is quite low in volume, so you'll want to increase it, preferably by using the delimited volume command (see Jim Bailey's article, More Text to Speech), or by bracketing any spoken text with a routine that raises the system volume to the maximum and then restores the user's preference, e.g.
thevolume := GetVolume();
SetVolume(4);
playSound(yourtext);
SetVolume(thevolume);
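If playSound throws partway through (an assumption; the article does not say whether it can), the sequence above would leave the volume at maximum. A minimal sketch of one way to guard against that wraps the same calls in a try block. It assumes playSound returns only after the text has been spoken, and the constant name kSpeakLoudFunc is hypothetical, not part of Macintalk or the original article.

```
// Hypothetical helper: speak a string at full volume, then restore
// the user's volume setting even if playSound throws.
DefConst('kSpeakLoudFunc,
   func(text)
   begin
      local savedVolume := GetVolume();   // remember the user's setting
      try
         SetVolume(4);                     // maximum, as in the lines above
         playSound(text);                  // Macintalk speaks the string
      onexception |evt.ex| do
         nil;                              // ignore the error; volume is restored below
      SetVolume(savedVolume);              // always put the volume back
   end
);

// Usage, from any script in the same project:
call kSpeakLoudFunc with ("Hello, this is Newton");
```

The delimited volume command mentioned above is probably the cleaner route where it works, since it leaves the system volume untouched.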
The default voice type is "Fred", and the rate and pitch controls default to a middle range. However, you can easily produce speech in any of the 9 available voices, and in a wide range of pitches and speaking rates. The basic method for controlling these options is to embed specific control codes in the text string; Macintalk parses them and responds appropriately. Any text between double brackets -- [[any text]] -- that is sent to Macintalk via playSound as part of a text string will not be spoken, but rather parsed by Macintalk for control codes. This is true even if the bracketed text is in an invalid format.
Jim Bailey's More Text to Speech has a full glossary of controls, so you'll want to consult that for an in-depth discussion. But as an example, [[svox xxxx]] will cause text to be spoken in voice xxxx, where xxxx is one of the following nine voices:
fred - the default; sort of like Kermit the Frog
So, for example,
playSound("[[svox zarv]] Hello, this is Newton");
will speak that sentence in Zarvox's voice.
Putting it all together
Two things to note about the control codes are that 1) they may be placed anywhere within the text string to be spoken; and 2) multiple control codes are possible. So for instance, this text string will be spoken as intended:
playSound("[[svox gnws]][[pbas -10]][[rate +200]] " &
   "Hello, this is [[svox zarv]][[rate -500]] Newton");
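Since the control codes are ordinary text, a string like the one above can be assembled from user preferences at run time. The following is a rough sketch of such a helper, assuming the [[svox]], [[rate]], and [[pbas]] codes accept the value forms used elsewhere in this article. The name kMakeSpeechFunc and its parameters are hypothetical; consult Jim Bailey's glossary for the valid ranges.

```
// Hypothetical helper: prefix a text string with voice, rate, and pitch
// control codes. The & operator converts the numbers to strings.
DefConst('kMakeSpeechFunc,
   func(voice, rate, pitch, text)
   begin
      "[[svox " & voice & "]]" &
      "[[rate " & rate & "]]" &
      "[[pbas " & pitch & "]] " & text;
   end
);

// Usage: builds "[[svox zarv]][[rate 200]][[pbas 45]] Hello, this is Newton"
playSound(call kMakeSpeechFunc with ("zarv", 200, 45, "Hello, this is Newton"));
```

Keeping the codes in one place like this makes it easier to add further preferences later without touching every playSound call.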
Sample Routine:
The following is a sample NS snippet that I wrote for use in GestureLaunch to speak user-highlighted text in the voice type of their preference (parm, a four-letter text string: zarv, kath, etc.). It could easily be keyed to an on-screen button, with similar preferences set for rate and pitch.
begin
   local hilitedText;
   local hiliteOffsets;
   local thevolume;
   local voxchoice;
   thevolume := GetVolume(); // Get current volume setting
   voxchoice := Clone(parm); // Get the user's choice of voice type
   hiliteOffsets := GetHiliteOffsets(); // Get the hilited section
   // Bail out if nothing is hilited
   if not (ClassOf(hiliteOffsets) = 'array) or Length(hiliteOffsets) = 0
      then return;
   // Extract the hilited text from the first hilite
   hilitedText := SubStr(hiliteOffsets[0][0].text, hiliteOffsets[0][1],
      hiliteOffsets[0][2] - hiliteOffsets[0][1]);
   try
      SetVolume(4); // Set volume to max
      // Prepend the voice control code to the text string to be spoken
      playSound("[[svox " & voxchoice & "]] " & hilitedText);
   onexception |evt.ex| do
      nil; // Ignore errors so the volume is always restored below
   SetVolume(thevolume); // Return volume to user setting
end
Advanced topic: Multiple Sound Channels
The MP2x00 can create and play up to four sound channels simultaneously (and maybe more?). You can take advantage of this to have Macintalk speak the same or different text in different voices simultaneously, i.e., to have up to four voices speaking at once. What follows is an exported WinNTK platform file of Bottles2, which features Ralph, Princess, and Zarvox singing a drunken chanty. Jake Bordens, who wrote the "engine" for Bottles2, has the Newton singing "Row, Row, Row Your Boat" in rounds. Bottles2 is available for download as well.
// Text of project C:\bottles\bottles2.ntk written on: 01/05/98 15:33:24
// Beginning of text file definitions.txt
DefConst('kSongText, "[[nmbr norm]][[pmod 1]]99[[pbas -8]] bottles of
[[pbas +8]]beer[[pbas -8]] on the [[rate 0]][[pbas +8]]wall [[rate 100]]
[[pbas +3]]99[[pbas -8]]bottles of [[rate 0]][[pbas +8]]beer[[pbas -5]]
[[rate 200]] you take one [[rate 0]]down[[rate 200]] pass it around [[slnc 200]]
[[pbas -8]]98[[pbas +3]] bottles of [[pbas +2]]beer on the [[pbas +2]]wall");
DefConst('kCloseChannelsFunc,
   func(channel)
   begin
      print("Checking to see if we can close the channels");
      if (channel[2] = nil) then return;
      if (channel[2]:isActive() = nil) then
         begin
            //close the channels
            print("closing the channels");
            channel[0]:Close();
            channel[1]:Close();
            channel[2]:Close();
         end;
      else
         AddDeferredCall(channel[3], [channel]);
   end
);
// End of text file C:\rowing\definitions.txt
// Beginning of file base.lyt
rowingBaseView :=
{viewBounds: {left: -3, top: 20, bottom: 254, right: 151},
channel1: nil,
channel2: nil,
channel3: nil,
debug: "rowingBaseView",
_proto: @179
};
_view000 := /* child of rowingBaseView */ {_proto: @166};
_view001 := /* child of rowingBaseView */
   {
   buttonClickScript:
      func()
      begin
         //initialize the sound channels
         print("Initializing the sound channels");
         spCh1 := {_proto: @431};
         spCh2 := {_proto: @431};
         spCh3 := {_proto: @431};
         //open the sound channels
         print("Opening the sound channels");
         spCh1:Open();
         spCh2:Open();
         spCh3:Open();
         //schedule the text to speak
         print("Scheduling the text to speak");
         spCh1:Schedule("[[rset 0]][[svox ralf]][[pbas 60]][[slnc 10]]" & kSongText);
         spCh2:Schedule("[[rset 0]][[svox zarv]][[pbas 45]]" & kSongText);
         spCh3:Schedule("[[rset 0]][[svox prin]][[pbas 45]]" & kSongText);
         //start the song
         print("starting the song");
         spCh1:start(true);
         spCh2:start(true);
         spCh3:start(true);
         //add a deferred call to close the sound channels
         print("registering the deferred call");
         AddDeferredCall(kCloseChannelsFunc, [[spCh1, spCh2, spCh3, kCloseChannelsFunc]]);
      end,
   text: "Sing?",
   viewBounds: {left: 14, top: 194, right: 140, bottom: 210},
   _proto: @226
   };
_view002 := /* child of rowingBaseView */
   {
   text:
"2 Drunks & a Robot!! --
The following initializes three sound channels and plays a voice track in each.
Be warned, use this at your own risk. Now...
without further delay....",
   viewBounds: {left: 12, top: 12, right: 142, bottom: 182},
   viewJustify: 0,
   viewFont: ROM_fontSystem10,
   _proto: @218
   };
constant |layout_base.lyt| := rowingBaseView;
// End of file base.lyt