Specifying pronunciations with user dictionaries

Dictionary substitution rules

When a dictionary instance is loaded, Vocalizer checks each individual word and multi-word fragment in the input text to determine whether the dictionary calls for a substitution. It performs the following checks in order, and stops as soon as a candidate is found:

Look for the string “as is” (with no modification).
Look for the string after removing leading and trailing quotes and brackets.
Look for the string with trailing dots removed.
Look for the lowercase form of the string.
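
The lookup order above can be sketched in Python as follows. This is only an illustration, not Vocalizer's implementation: the helper names, the exact set of quote and bracket characters, and the choice to apply each step to the original string (rather than cumulatively) are assumptions that the text above does not settle.

```python
# Assumed set of quote and bracket characters; the documentation does not list them.
QUOTES_AND_BRACKETS = "\"'“”‘’()[]{}<>"


def candidate_forms(token):
    """Yield the forms of a token to try, in the documented order."""
    yield token                             # 1. the string "as is"
    yield token.strip(QUOTES_AND_BRACKETS)  # 2. leading/trailing quotes and brackets removed
    yield token.rstrip(".")                 # 3. trailing dots removed
    yield token.lower()                     # 4. lowercase form


def lookup(token, entries):
    """Return the destination string for the first form found, or None."""
    for form in candidate_forms(token):
        if form in entries:
            return entries[form]  # stop at the first candidate
    return None
```

With a hypothetical entry table such as `{"dll": "Dynamic Link Library"}`, the inputs `dll`, `"dll"`, `dll.`, and `DLL` all resolve to the same destination string in this sketch.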

Rules for matching the source string:

If the source string has one or more uppercase letters, the match is case-sensitive. For example, when the dictionary entry is "DLL", only the text input key "DLL" will match that entry.
If the source string has no uppercase letters, the match is not case-sensitive. For example, when the dictionary entry is "dll", text input keys such as "dll", "Dll", and "dLL" will match.

Note: It is possible for two separate dictionary entries to have source strings that differ only in casing. If there is a separate source string with uppercase letters that also matches the input, this uppercase string will take precedence.
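
The case rules and the precedence described in the note can be illustrated with a short Python sketch. The function names and the plain-dict entry table are hypothetical; the sketch only mirrors the rules stated above.

```python
def has_uppercase(text):
    """True if the string contains at least one uppercase letter."""
    return any(ch.isupper() for ch in text)


def match_source(key, entries):
    """Return the destination string for an input key, or None.

    entries maps source strings to destination strings.
    """
    # A source string with uppercase letters matches only the identical
    # input key, and takes precedence over an all-lowercase source string.
    if has_uppercase(key) and key in entries:
        return entries[key]
    # A source string without uppercase letters matches regardless of the
    # casing of the input key.
    lowered = key.lower()
    if lowered in entries:
        return entries[lowered]
    return None
```

For example, with entries for both `DLL` and `dll`, the input `DLL` selects the uppercase entry, while `Dll` and `dLL` fall back to the lowercase one. With only a `DLL` entry, the input `dll` does not match.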

Precedence rules for dictionary substitution:

When the same source string occurs more than once in the same subheader, the last occurrence determines the destination string.
When the same source string occurs in different subheaders with different content types (one phonetic and one orthographic), the occurrence in the first subheader determines the destination string.
Only complete words are matched. If the source string in the dictionary is a substring of a word in the input text, it will not be substituted.

Text format dictionary example

Following is an example input file for a dictionary.

Note: User dictionaries use 3-character Vocalizer language codes.

[Header]

Language = ENG

[SubHeader]

Content = EDCT_CONTENT_BROAD_NARROWS

Representation = EDCT_REPR_SZZ_STRING

[Data]

zero // #'zi.R+o&U#

addr // #'@.dR+Es#

adm // #@d.'2mI.n$.'stR+e&I.S$n#

[SubHeader]

Content=EDCT_CONTENT_ORTHOGRAPHIC

Representation=EDCT_REPR_SZ_STRING

[Data]

Info      Information

IT        "Information Technology"

DLL       "Dynamic Link Library"

A-level   "advanced level"

Afr       africa

Acc       account

TEL       telephone

Anon      anonymous

AP        "associated press"

Dictionary format for Vocalizer

Textual dictionaries must be encoded in UTF-8. They must not contain the 3-byte UTF-8 preamble, also known as the UTF-8 BOM or signature.
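
A quick way to check a file against these encoding requirements before using it is sketched below in Python. The function name and the returned messages are hypothetical; only the two requirements themselves (UTF-8, no BOM) come from this documentation.

```python
import codecs
from typing import Optional


def encoding_problem(raw: bytes) -> Optional[str]:
    """Return a description of the encoding problem, or None if the bytes look fine."""
    # The 3-byte UTF-8 preamble (EF BB BF) is not allowed in textual dictionaries.
    if raw.startswith(codecs.BOM_UTF8):
        return "file starts with a UTF-8 BOM"
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return f"file is not valid UTF-8: {err}"
    return None
```

A typical use would be to read the dictionary file in binary mode (`open(path, "rb").read()`) and pass the bytes to this function.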

The general format of textual dictionaries consists of one [Header] label and its properties, followed by several [SubHeader]-[Data] label pairs with their properties and data. Each [SubHeader] describes the expected data properties (such as orthographic or phonetic text), while each [Data] section contains the actual source strings and the destination strings that replace them.

You can represent the destination string of a dictionary entry using orthographic or phonetic text. For phonetic strings, you must use the L&H+ phonetic alphabet. For information about phonemes, see your Language Supplement.

The simplest dictionary consists of one [Header] label and one [Data] label; but while it’s syntactically correct, such a dictionary doesn’t specify any actions.

Here is an example of the format:

[Header]

Language = language_code

[SubHeader]

Content=content_type

Representation=representation_type

Language = language_code

[Data]

source_stringseparatordestination_string

Item	Description
language_code	Three-letter code used to identify the language; for example, ENU for American English. The language code is mandatory; it must be specified either in the header, or in each sub-header. Only one language may be used in each dictionary.
content_type	Type of content checked against the dictionary. There are two options: EDCT_CONTENT_ORTHOGRAPHIC for orthographic strings, and EDCT_CONTENT_BROAD_NARROWS for phonetic strings. The content type determines the representation type. You must specify the content type in each sub-header in the dictionary.
representation_type	Representation type used for the output: EDCT_REPR_SZ_STRING if the content type is EDCT_CONTENT_ORTHOGRAPHIC, or EDCT_REPR_SZZ_STRING if the content type is EDCT_CONTENT_BROAD_NARROWS. You must specify the representation type for each sub-header in the dictionary.
source_string	Source string that is to be replaced. If the string has multiple words, enclose them in double quotes ("). Optionally, you can use the <ESC>\mw\ control sequence to add whitespace characters to a multi-word phrase. This is not required; the syntax is kept for compatibility with previous releases.
separator	Separator between the source string and the destination string. This separator must be a tab character.
destination_string	One or more words used to replace the source string. If the string consists of phonetic symbols, precede it with two forward slashes (//). If the string has multiple words, enclose them in double quotes (").

Each dictionary can include several sub-header sections; each sub-header can include several data sections; and each data section can include several different source/destination string pairings. Each source/destination string pair must appear on a separate line within the data section.
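
To make the layout concrete, the following Python sketch reads a text format dictionary into a simple structure. It is a minimal illustration, not the parser used by Vocalizer or dictcpl: the function names, the returned structure, and the handling of blank lines and surrounding spaces are assumptions. Phonetic destination strings are kept as written, including the leading //.

```python
def unquote(text):
    """Remove one pair of enclosing double quotes, if present."""
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]
    return text


def parse_dictionary(text):
    """Return (header_properties, subheaders) for a text format dictionary.

    Each subheader is a dict with "properties" and "entries", where entries
    is a list of (source_string, destination_string) pairs.
    """
    header = {}
    subheaders = []
    section = None
    for line in text.splitlines():
        label = line.strip()
        if not label:
            continue  # assumption: blank lines are ignored
        if label == "[Header]":
            section = "header"
        elif label == "[SubHeader]":
            section = "subheader"
            subheaders.append({"properties": {}, "entries": []})
        elif label == "[Data]":
            section = "data"
        elif section == "data":
            if not subheaders:
                raise ValueError("data entry without a preceding [SubHeader]")
            # The separator between source and destination must be a tab.
            source, tab, destination = line.partition("\t")
            if not tab:
                raise ValueError(f"missing tab separator: {line!r}")
            entry = (unquote(source.strip()), unquote(destination.strip()))
            subheaders[-1]["entries"].append(entry)
        elif section in ("header", "subheader"):
            name, _, value = line.partition("=")
            target = header if section == "header" else subheaders[-1]["properties"]
            target[name.strip()] = value.strip()
        else:
            raise ValueError(f"unexpected content before [Header]: {line!r}")
    return header, subheaders
```

Calling `parse_dictionary` on the contents of a dictionary file returns the header properties (for example, the language code) and one item per [SubHeader], each with its own properties and source/destination pairs.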

Here is an example of a short dictionary:

[Header]

Language = ENU

[SubHeader]

Content=EDCT_CONTENT_ORTHOGRAPHIC

Representation=EDCT_REPR_SZ_STRING

[Data]

DLL	Dynamic Link Library

Hello	Welcome to the demonstration of the American English Text-to-Speech system.

info	Information

[SubHeader]

Content = EDCT_CONTENT_BROAD_NARROWS

Representation = EDCT_REPR_SZZ_STRING

[Data]

addr  // '@.dR+Es

Using the dictionary conversion tool (dictcpl)

The conversion tool (dictcpl) converts a text format dictionary to a binary dictionary. The tool is a console program, so you run it from a command prompt:

dictcpl -o binary-dictionary text-dictionary

Options:

Option	Description
-o	Binary dictionary (output)

For example:

> dictcpl -o userdct_enu.bdc enu.txt
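
If the conversion is part of a build script, the same command can be issued from Python. This sketch assumes that dictcpl is on the PATH and that it signals failure with a nonzero exit code (which this documentation does not state). The function names are hypothetical, and only the documented -o option is used.

```python
import subprocess


def dictcpl_command(text_dictionary, binary_dictionary):
    """Build the argument list: dictcpl -o binary-dictionary text-dictionary."""
    return ["dictcpl", "-o", binary_dictionary, text_dictionary]


def compile_dictionary(text_dictionary, binary_dictionary):
    """Run dictcpl; raises CalledProcessError on a nonzero exit code."""
    subprocess.run(dictcpl_command(text_dictionary, binary_dictionary), check=True)
```

For the example above, the call would be `compile_dictionary("enu.txt", "userdct_enu.bdc")`.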

Loading user dictionaries

Use either of the following to load user dictionaries at runtime:

The SSML <lexicon> element.
The <default_dictionaries> XML configuration file parameter.

You can load any number of user dictionaries at runtime. The load order determines the precedence, with more recently loaded dictionaries having precedence over previously loaded dictionaries.

The runtime consults only user dictionaries whose language matches the current synthesis language.
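
The load-order precedence and the language filter can be pictured with the following Python sketch. It models loaded dictionaries as a list of (language code, entries) pairs in load order; this representation and the function name are assumptions for illustration only, not the runtime's actual data structures.

```python
def lookup_loaded(word, synthesis_language, loaded):
    """Return the destination string from the most recently loaded matching dictionary.

    loaded is a list of (language_code, entries) pairs in load order.
    """
    # Walk from the most recently loaded dictionary back to the oldest.
    for language_code, entries in reversed(loaded):
        if language_code != synthesis_language:
            continue  # dictionaries for other languages are not consulted
        if word in entries:
            return entries[word]
    return None
```

In this model, if two ENU dictionaries both define `info`, the one loaded later supplies the destination string, and an ENG dictionary is skipped entirely while the synthesis language is ENU.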

Restrictions on user dictionaries

You cannot call dictionary functions on a TTS engine instance while it is processing.
