Specifying pronunciations with user dictionaries
You can create user dictionaries to tune or alter the default pronunciation of words spoken by Vocalizer. For example, you can define the pronunciation of words from foreign languages, define the expansion of a special acronym, or tune the pronunciation of a word that has an unusual spelling.
Process for user dictionaries:
- Define combinations of source and destination strings in a dictionary file. (When the source appears in input text, the TTS conversion uses the destination pronunciation.)
- Convert the dictionary file to binary format using Nuance Vocalizer Studio or the conversion tool (dictcpl).
- Load the binary file into the Vocalizer runtime environment.
Vocalizer consults the user dictionary for each individual word in the input text, including multi-word fragments, to check whether to replace the input text with a destination string from the dictionary.

When a dictionary is loaded, Vocalizer looks up each individual word and multi-word fragment of the input text to determine whether a substitution from the dictionary applies. It performs the following checks in order, and stops as soon as a candidate is found:
- Look for the string “as is” (with no modification).
- Look for the string after removing leading and trailing quotes and brackets.
- Look for the string with trailing dots removed.
- Look for the lowercase form of the string.
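The lookup order above can be sketched in Python as follows. This is an illustrative sketch, not Vocalizer's implementation: the function names are hypothetical, the set of quote and bracket characters is an assumption, and so is the choice to apply each step to the result of the previous one (the text does not say whether the steps are cumulative).

```python
from typing import Optional

# Assumption: the exact quote and bracket characters are not listed in the
# text above; this set is illustrative only.
QUOTES_AND_BRACKETS = "\"'()[]{}<>"


def lookup_candidates(token: str) -> list[str]:
    """Return the forms of `token` to try, in lookup order, without repeats."""
    as_is = token
    unwrapped = as_is.strip(QUOTES_AND_BRACKETS)  # quotes and brackets removed
    undotted = unwrapped.rstrip(".")              # trailing dots removed
    lowered = undotted.lower()                    # lowercase form
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return list(dict.fromkeys([as_is, unwrapped, undotted, lowered]))


def find_substitution(token: str, entries: dict[str, str]) -> Optional[str]:
    """Return the destination for the first candidate found, else None."""
    for candidate in lookup_candidates(token):
        if candidate in entries:
            return entries[candidate]
    return None
```

With a dictionary that contains only the entry DLL, the input token (DLL) would be found at the second step, after the brackets are removed.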
Rules for matching the source string:
- If the source string has one or more uppercase letters, the match is case-sensitive. For example, when the dictionary entry is "DLL", only the text input key "DLL" will match that entry.
- If the source string has no uppercase letters, the match is not case-sensitive. For example, when the dictionary entry is "dll", text input keys such as "dll", "Dll", and "dLL" will match.
Note: It is possible for two separate dictionary entries to have source strings that differ only in casing. If there is a separate source string with uppercase letters that also matches the input, this uppercase string will take precedence.
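The case rules above can be modeled with a short Python sketch. The function name and the dictionary-as-mapping representation are hypothetical, chosen only to illustrate the rules; this is not how the runtime is implemented.

```python
from typing import Optional


def select_destination(token: str, entries: dict[str, str]) -> Optional[str]:
    """Pick the destination for `token` according to the case rules above."""
    # A source string that contains uppercase letters matches only the
    # identical input, and it takes precedence over a lowercase entry.
    if any(ch.isupper() for ch in token) and token in entries:
        return entries[token]
    # A source string with no uppercase letters matches any casing of the
    # input, so compare against the lowercased token.
    return entries.get(token.lower())


entries = {"DLL": "Dynamic Link Library", "dll": "dee el el"}
print(select_destination("DLL", entries))  # the uppercase entry is used
print(select_destination("Dll", entries))  # falls back to the lowercase entry
```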
Precedence rules for dictionary substitution:
- When the same source string occurs more than once in the same subheader, the last occurrence determines the destination string.
- When the same source string occurs in different subheaders with different content types (one phonetic and one orthographic), the occurrence in the first subheader determines the destination string.
- Only complete words are matched. If the source string in the dictionary is a substring of a word in the input text, it will not be substituted.
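The first two precedence rules can be sketched as follows. The data layout (a list of subheaders, each with a content type and its source/destination pairs) and the function name are hypothetical. The text only defines the cross-subheader rule for subheaders with different content types; applying "first subheader wins" to all subheaders is an assumption of this sketch.

```python
Pair = tuple[str, str]
SubHeader = tuple[str, list[Pair]]  # (content type, source/destination pairs)


def resolve_entries(subheaders: list[SubHeader]) -> dict[str, Pair]:
    """Map each source string to the (content type, destination) that wins."""
    resolved: dict[str, Pair] = {}
    for content_type, pairs in subheaders:
        within: dict[str, str] = {}
        for source, destination in pairs:
            within[source] = destination  # same subheader: last occurrence wins
        for source, destination in within.items():
            # Across subheaders: keep the entry from the first subheader.
            resolved.setdefault(source, (content_type, destination))
    return resolved
```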

The following is an example input file for a dictionary.
Note: User dictionaries use 3-character Vocalizer language codes.
[Header]
Language = ENG
[SubHeader]
Content = EDCT_CONTENT_BROAD_NARROWS
Representation = EDCT_REPR_SZZ_STRING
[Data]
zero // #'zi.R+o&U#
addr // #'@.dR+Es#
adm // #@d.'2mI.n$.'stR+e&I.S$n#
[SubHeader]
Content=EDCT_CONTENT_ORTHOGRAPHIC
Representation=EDCT_REPR_SZ_STRING
[Data]
Info Information
IT "Information Technology"
DLL "Dynamic Link Library"
A-level "advanced level"
Afr africa
Acc account
TEL telephone
Anon anonymous
AP "associated press"

Textual dictionaries must be encoded in UTF-8. They must not contain the 3-byte UTF-8 preamble, also known as the UTF-8 BOM or signature.
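Before compiling, it can be useful to verify the encoding requirement yourself. The following Python sketch checks a file's bytes for valid UTF-8 and for a leading BOM. The function name and the message wording are hypothetical; the check is not part of the Vocalizer tools.

```python
import codecs


def check_dictionary_encoding(data: bytes) -> list[str]:
    """Return a list of encoding problems found in a text dictionary."""
    problems = []
    # codecs.BOM_UTF8 is the 3-byte sequence EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        problems.append("file starts with a UTF-8 BOM")
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append(f"not valid UTF-8 at byte offset {exc.start}")
    return problems
```

A typical use would be `check_dictionary_encoding(open(path, "rb").read())`, where `path` is your text dictionary; an empty list means neither problem was found.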
The general format of textual dictionaries consists of one [Header] label and its properties, and several [SubHeader]-[Data] label couples with their properties and data. Each [SubHeader] describes the expected data properties (such as orthographic or phonetic text) while [Data] describes the actual source string that needs to be replaced with a destination string.
You can represent the destination string of a dictionary entry using orthographic or phonetic text. For phonetic strings, you must use the L&H+ phonetic alphabet. For information about phonemes, see your Language Supplement.
The simplest dictionary consists of one [Header] label and one [Data] label; but while it’s syntactically correct, such a dictionary doesn’t specify any actions.
Here is an example of the format:
[Header]
Language = language_code
[SubHeader]
Content=content_type
Representation=representation_type
Language = language_code
[Data]
source_stringseparatordestination_string
| Item | Description |
|---|---|
| language_code | Three-letter code used to identify the language; for example, ENU for American English. The language code is mandatory; it must be specified either in the header, or in each sub-header. Only one language may be used in each dictionary. |
| content_type | Type of content checked against the dictionary. There are two options: EDCT_CONTENT_ORTHOGRAPHIC (orthographic text) and EDCT_CONTENT_BROAD_NARROWS (phonetic text). The content type determines the representation type. You must specify the content type in each sub-header in the dictionary. |
| representation_type | Representation type used for the output. The examples in this section use EDCT_REPR_SZ_STRING with orthographic content and EDCT_REPR_SZZ_STRING with phonetic content. You must specify the representation type for each sub-header in the dictionary. |
| source_string | Source string that is to be replaced. If the string has multiple words, enclose them in double quotes ("). Optionally, to add whitespace characters to a multi-word phrase, use the <ESC>\mw\ control sequence. (This is not required; the syntax is kept for compatibility with previous releases.) |
| separator | Separator between the source string and the destination string. This separator must be a tab character. |
| destination_string | One or more words to be used to replace the source string. If the string consists of phonetic symbols, precede it with two forward slashes (//). If the string has multiple words, enclose them in double quotes ("). |
Each dictionary can include several sub-header sections; each sub-header can include several data sections; and each data section can include several different source/destination string pairings. Each source/destination string pair must appear on a separate line within the data section.
Here is an example of a short dictionary:
[Header]
Language = ENU
[SubHeader]
Content=EDCT_CONTENT_ORTHOGRAPHIC
Representation=EDCT_REPR_SZ_STRING
[Data]
DLL	Dynamic Link Library
Hello	Welcome to the demonstration of the American English Text-to-Speech system.
info	Information
[SubHeader]
Content = EDCT_CONTENT_BROAD_NARROWS
Representation = EDCT_REPR_SZZ_STRING
[Data]
addr // '@.dR+Es
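Because the separator must be a tab and the file must be UTF-8 without a BOM, it can be convenient to generate the text dictionary from a script instead of typing it. The sketch below is one way to do that in Python. The function names and the input layout are hypothetical. It assumes that phonetic destinations are passed in already prefixed with //, and that "\n" line endings are acceptable, which the text does not state.

```python
def quote_if_multiword(text: str) -> str:
    """Enclose a string in double quotes when it has more than one word."""
    return f'"{text}"' if len(text.split()) > 1 else text


def render_dictionary(language: str, subheaders: list) -> bytes:
    """Render (content, representation, pairs) subheaders as dictionary bytes."""
    lines = ["[Header]", f"Language = {language}"]
    for content, representation, pairs in subheaders:
        lines += [
            "[SubHeader]",
            f"Content = {content}",
            f"Representation = {representation}",
            "[Data]",
        ]
        phonetic = content == "EDCT_CONTENT_BROAD_NARROWS"
        for source, destination in pairs:
            if not phonetic:
                # Phonetic destinations are written exactly as supplied.
                destination = quote_if_multiword(destination)
            # The separator between source and destination is a tab.
            lines.append(f"{quote_if_multiword(source)}\t{destination}")
    # Python's "utf-8" codec does not emit a BOM (unlike "utf-8-sig").
    return ("\n".join(lines) + "\n").encode("utf-8")
```

Write the returned bytes with a file opened in binary mode (`"wb"`) so that no newline translation or BOM is introduced.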

If you experience an error or unexpected behavior, check the following points:
- Text dictionaries must be encoded in UTF-8. All characters in the 7-bit US-ASCII range (hex 20 to 7f) are encoded the same way in UTF-8, US-ASCII, Windows-1252, ISO-8859-1, and other encodings, so a dictionary that uses only characters in the ASCII range can be saved as (for example) Windows-1252. If a non-ASCII character is present (for example, ä) and the file is saved as (for example) Windows-1252, an error is returned when the dictionary is compiled. Similarly, when the dictionary file is opened in Nuance Vocalizer Studio (see below), a fatal error is displayed.
- When the content type is EDCT_CONTENT_ORTHOGRAPHIC, the destination strings for this subheader must consist only of orthographic characters. A phonetic string is interpreted as an orthographic string, and no error is returned.
- When the content type is EDCT_CONTENT_BROAD_NARROWS, the destination strings expected for this subheader must consist only of phonetic characters; an error is returned for any destination string that isn't preceded by two forward slashes (//).
- When unknown symbols are used in phonetic content, they are ignored.
- Only one language can be specified. If more than one language is specified, no error is returned, but the dictionary is ignored.
- The specified language has to be installed. If the language is not installed, no error is returned, but the dictionary is ignored.
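Several of these problems produce no error message, so a quick check before compiling can save time. The following Python sketch flags three conditions described in this section: a data line without a tab separator, a phonetic destination without the leading //, and more than one language code. It is a hypothetical helper, not part of the Vocalizer tools, and its simple parser assumes the label and property layout shown in the examples above.

```python
def lint_dictionary(text: str) -> list[str]:
    """Return human-readable problems found in a text dictionary."""
    problems = []
    languages = set()
    content = None
    in_data = False
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue
        if stripped in ("[Header]", "[SubHeader]"):
            in_data = False
            if stripped == "[SubHeader]":
                content = None  # each sub-header declares its own content type
            continue
        if stripped == "[Data]":
            in_data = True
            continue
        if not in_data:
            # Property line such as "Language = ENU" or "Content=..."
            key, _, value = stripped.partition("=")
            key, value = key.strip(), value.strip()
            if key == "Language":
                languages.add(value)
            elif key == "Content":
                content = value
            continue
        source, tab, destination = line.partition("\t")
        if not tab:
            problems.append(f"line {number}: no tab separator")
        elif (content == "EDCT_CONTENT_BROAD_NARROWS"
              and not destination.lstrip().startswith("//")):
            problems.append(f"line {number}: phonetic destination lacks //")
    if len(languages) > 1:
        problems.append("more than one language: " + ", ".join(sorted(languages)))
    return problems
```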

The conversion tool (dictcpl) converts a text-format dictionary to a binary dictionary. The tool is a console program, so you run it from a command prompt:
dictcpl -o binary-dictionary text-dictionary
Options:
| Option | Description |
|---|---|
| -o | Binary dictionary (output) |
For example:
> dictcpl -o userdct_enu.bdc enu.txt
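If you compile dictionaries as part of a build, the same command can be run from a script. The following Python sketch uses only the -o option shown above. The function names are hypothetical, and it assumes that dictcpl is on the PATH and that it exits with a non-zero status on failure, which this section does not state.

```python
import subprocess


def dictcpl_command(text_dictionary: str, binary_dictionary: str) -> list[str]:
    """Build the argument list: dictcpl -o binary-dictionary text-dictionary."""
    return ["dictcpl", "-o", binary_dictionary, text_dictionary]


def compile_dictionary(text_dictionary: str, binary_dictionary: str) -> None:
    """Run dictcpl; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(dictcpl_command(text_dictionary, binary_dictionary), check=True)
```

For example, `compile_dictionary("enu.txt", "userdct_enu.bdc")` runs the same command as the example above.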

Use either of the following to load user dictionaries at runtime:
- The SSML <lexicon> element
- The <default_dictionaries> XML configuration file parameter
You can load any number of user dictionaries at runtime. The load order determines the precedence, with more recently loaded dictionaries having precedence over previously loaded dictionaries.
The runtime consults only user dictionaries whose language matches the current synthesis language.
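The load-order and language rules can be modeled with a short Python sketch. The function name and the list-of-tuples representation of loaded dictionaries are hypothetical, and the sketch uses exact key matching only; it illustrates the precedence rule, not the runtime's actual lookup.

```python
from typing import Optional


def runtime_lookup(token: str, loaded: list, language: str) -> Optional[str]:
    """Look up `token` in (language, entries) dictionaries given in load order."""
    # Walk from the most recently loaded dictionary to the oldest.
    for dict_language, entries in reversed(loaded):
        if dict_language != language:
            continue  # only dictionaries for the synthesis language are consulted
        if token in entries:
            return entries[token]
    return None


loaded = [
    ("ENU", {"DLL": "first"}),
    ("ENG", {"DLL": "british"}),
    ("ENU", {"DLL": "second"}),
]
print(runtime_lookup("DLL", loaded, "ENU"))  # the later ENU dictionary wins
```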

You cannot call dictionary functions on a TTS engine instance while it is processing.