SWIrcnd—recognition end

This event is logged at the end of recognition.

Note: When SWIrcnd is printed, the entries in the log are not guaranteed to be sorted in n-best order.

In addition to the Tokens used for every event, this event has the following tokens:

Token	Meaning
BORT	Beginning of recognition time (when the recognizer first processed the signal).
CONF	Confidence value for n-best item. Values can range from 0 to 999.
DPNM	Root name of the diphone acoustic models used to recognize the top choice on the n-best list. (If there is no applicable value to report, a value of NA is used.)
DURS	Amount of speech processed by the recognizer in milliseconds. The value can sometimes exceed EOSS by small amounts. See Measuring latency with EORT and EOSS.
ENDR	See Reasons for end of speech.
EORT	End-of-recognition time in milliseconds. Clock time when the results are ready. Measured in real time from the arrival of the first packet of the input stream. See Measuring latency with EORT and EOSS.
EOSD	How much speech data was passed to the endpointer before EOS was determined. This token helps determine latency due to endpointer decision-making (mostly end of speech timeout). If EOSD equals EOSS then something unusual caused the end-of-speech; for example, the maximum speech duration timer expired. See Measuring latency with EORT and EOSS.
EOSS	End-of-speech signal: where in the input stream the endpointer wanted the recognizer to stop. See Measuring latency with EORT and EOSS.
EOST	End-of-speech time in milliseconds. Clock time when the endpointer determined the end of caller speech; measured in real time from the arrival of the first packet; delays in the audio path are not counted. See Measuring latency with EORT and EOSS.
GRMR	Grammar for n-best item.
KEYS	List of key/value pairs for the top result.
LA	Value of the swirec_load_adjusted_speedvsaccuracy parameter used for the recognition. Values include: idle, normal, busy, Xidle, Xnormal, and Xbusy. "X" values indicate that the parameter specified that value. Values without "X" were determined at runtime with the parameter setting "on."
MACC	Filename of the statistics file (the monophone accumulator) that tuned the acoustic model used for the recognition event. (Also, see the DACC token).
MDVR	Model version—version stamp of models. Format is L.M.m.s, where L is language number, M is major version, m is minor version, and s is the set number.
MEDIA	An audio media type. For example, "MEDIA=audio/basic;rate:8000"
MPNM	Indicates the acoustic models used for generating the recognition result. Contains a comma separated list showing the language and acoustic model filenames used for first-pass recognition processing to get the top choice on the n-best list. Each list element has the format LangCode/Version/Path/Filename. (If there is no applicable value to report, a value of NA is used.) For example: MPNM=en.us/10.0.0/models/FirstPass/models.hmm,de.de/10.0.0/models/FirstPass/models.hmm
NBST	Number of n-best items. Used only if RSTT is "ok" or "lowconf."
OFFS	For internal use only. Shows an offset value for acoustic models. For example, "OFFS=1.3".
RAWS	Raw score for n-best item.
RAWT	Raw text for n-best item; set to the value of the SWI_literal key.
RCPU	Recognizer CPU time in milliseconds. Measures how much CPU was used for the recognition.
RENR	See Reasons for end of recognition.
RSLT	Parsed text for n-best item.
RSTT	See Return codes.
SAFEK	Parsed text for n-best item. Used only if the grammar sets SWI_safeKey. Typically, the key passes a partial recognition result when passing the whole result might be a security risk.
SCAL	For internal use only. Shows a multiplier for acoustic scale. For example, "SCAL=5.5".
SECURE	Indicates that sensitive information is suppressed for this event. The token only appears when true.
SPAG	The second pass has not modified the result of the first pass. When the recognizer is "unsure" about the accuracy of the n-best list, it invokes a second pass through the data to help improve the accuracy. A second pass uses more CPU and may also presage a low-confidence recognition.
SPIV	The second pass has been invoked. When the recognizer is "unsure" about the accuracy of the n-best list, it invokes a second pass through the data to help improve the accuracy. A second pass uses more CPU and may also presage a low-confidence recognition.
SPMS	Second-pass models. Contains a comma separated list showing the language and acoustic model used to recognize the top choice on the n-best list. When this token appears in the log, it confirms the recognizer performed second-pass processing. It does not appear when recognition completes after the first-pass (see MPNM). (Because the n-best can change during the second pass, MPNM and SPMS might not be consistent. For example, they might refer to different languages.) Each list element has the format LangCode/Version/Path/Filename. (If there is no applicable value to report, a value of NA is used.) For example: SPMS=en.us/10.0.0/models/SecondPass1/models1.hmm,de.de/10.0.0/models/SecondPass1/models1.hmm
SPOK	Normalized raw text for n-best item; set to the value of the SWI_spoken key.
WVNM	Waveform name.
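The MDVR and MPNM/SPMS formats described in the table can be split into their parts with a few lines of code. The following is a minimal Python sketch, not part of the product: the names `ModelVersion`, `ModelRef`, `parse_mdvr`, and `parse_model_list` are hypothetical, and the code assumes values follow the L.M.m.s and LangCode/Version/Path/Filename formats exactly. A value in another shape (such as the MPNM value in the sample event later on this page) raises ValueError.

```python
from typing import List, NamedTuple


class ModelVersion(NamedTuple):
    language: int
    major: int
    minor: int
    set_number: int


class ModelRef(NamedTuple):
    lang_code: str
    version: str
    path: str
    filename: str


def parse_mdvr(value: str) -> ModelVersion:
    """Split an MDVR value of the form L.M.m.s into four integers."""
    parts = value.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected L.M.m.s, got {value!r}")
    return ModelVersion(*(int(p) for p in parts))


def parse_model_list(value: str) -> List[ModelRef]:
    """Parse an MPNM or SPMS value; the placeholder NA yields an empty list."""
    if value == "NA":
        return []
    refs = []
    for element in value.split(","):
        # LangCode/Version/Path/Filename, where Path may itself contain slashes.
        lang_code, version, rest = element.split("/", 2)
        path, _, filename = rest.rpartition("/")
        refs.append(ModelRef(lang_code, version, path, filename))
    return refs


version = parse_mdvr("1.7.0.0")
models = parse_model_list(
    "en.us/10.0.0/models/FirstPass/models.hmm,"
    "de.de/10.0.0/models/FirstPass/models.hmm"
)
print(version.language, version.major)   # 1 7
print([m.lang_code for m in models])     # ['en.us', 'de.de']
```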

Recognition-end example

Below is a sample recognition-end (SWIrcnd) event:

TIME=20010816125814573|CHAN=1|EVNT=SWIrcnd|RSTT=ok|NBST=3|
RSLT=????0130|RAWT=january thirtieth|SPOK=january thirtieth|
GRMR=GURI0|KEYS=<YEAR conf="988">????</YEAR>
<CENTURY conf="988">??</CENTURY><TWO_DIGIT_YEAR conf="988">??
</TWO_DIGIT_YEAR><MONTH conf="988">01</MONTH>
<DAY conf="982">30</DAY><WEEKDAY conf="988">?</WEEKDAY>
<SWI_disallow conf="988">0</SWI_disallow>
<SWI_scoreDelta conf="988">0</SWI_scoreDelta>
<MEANING conf="982">????0130</MEANING>|CONF=982|RAWS=3150|
RSLT=????0131|RAWT=january thirty_first|
SPOK=january thirty_first|GRMR=GURI0|CONF=72|RAWS=1856|
MDVR=1.7.0.0|MPNM=en-us/SpeechPearl|
MACC=noise.intmodels.stats.20021204000834|
DACC=NULL|EOSS=1624|DURS=1624|EOSD=1624|BORT=90|EOST=130|
EORT=891|CPAR=0.315,0.863,-0.223,0.743,0.450,0.156,1.000,0.098|
LA=idle|OFFS=1.3|SCAL=5.5|AWP=1300|FMM=10000|SLM=32|MFP=5000|UCPU=711|SCPU=30

In this case, the recognition engine came up with three possible answers (NBST=3), and the top two appear in separate RSLT tokens. By default, only the top two results are logged; use swirec_max_logged_nbest to change the number of logged results. "????0130" was the top n-best result, with a confidence score (CONF) of 982. Note that the confidence for a particular key can be equal to or higher than the overall confidence for the entire utterance, but not lower. For example, the MONTH key (01) has a confidence of 988, which is higher than the overall confidence of 982.

The key/value pairs for the top result appear in the KEYS token in an XML format. For each key, the confidence score is listed in the "conf" attribute. The SWI_meaning, SWI_literal, and SWI_spoken keys are not printed in the KEYS field, since they are printed in the RSLT, RAWT, and SPOK fields, respectively.
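Because the KEYS value is a run of sibling XML elements, a standard XML parser can extract each key with its confidence. The sketch below is illustrative only: `parse_keys` is a hypothetical helper, it assumes the KEYS value is well-formed XML once the line wrapping of the log is removed, and it wraps the elements in a synthetic root element because the token has no single root of its own.

```python
import xml.etree.ElementTree as ET


def parse_keys(keys_value: str) -> dict:
    """Return {key: (value, conf)} for each element in a KEYS token value."""
    # The elements are siblings, so add a synthetic root before parsing.
    root = ET.fromstring(f"<keys>{keys_value}</keys>")
    return {
        child.tag: ((child.text or "").strip(), int(child.get("conf")))
        for child in root
    }


# A shortened KEYS value taken from the sample event above.
sample = (
    '<YEAR conf="988">????</YEAR>'
    '<MONTH conf="988">01</MONTH>'
    '<DAY conf="982">30</DAY>'
)
keys = parse_keys(sample)
print(keys["DAY"])    # ('30', 982)
print(keys["MONTH"])  # ('01', 988)
```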

For each item on the n-best list, the GRMR field identifies the active grammar that produced it, using the appropriate GURIx value specified in the SWIrcst event. Here, the first two results used GURI0 (the builtin:date grammar from the SWIrcst example above). Applying the date grammar to the raw text "january thirtieth" gives the result "????0130."
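To work with the per-item tokens programmatically, the event can be split on the "|" separator and the repeated tokens grouped by n-best item. The following Python sketch rests on assumptions drawn only from the sample above: fields are separated by "|", each field is TOKEN=value, token values contain no "|", and every n-best item begins with its RSLT token. The function name `parse_rcnd` and the `PER_ITEM` set are hypothetical.

```python
from typing import Dict, List, Tuple

# Tokens reported per n-best item (KEYS is logged for the top result only).
PER_ITEM = {"RSLT", "RAWT", "SPOK", "GRMR", "KEYS", "CONF", "RAWS", "SAFEK"}


def parse_rcnd(line: str) -> Tuple[Dict[str, str], List[Dict[str, str]]]:
    """Split one SWIrcnd event into event-level tokens and n-best items."""
    event: Dict[str, str] = {}
    nbest: List[Dict[str, str]] = []
    for field in line.strip().split("|"):
        name, sep, value = field.partition("=")  # values may contain "="
        if not sep:
            continue  # skip empty fields and anything without "="
        if name in PER_ITEM:
            if name == "RSLT" or not nbest:
                nbest.append({})  # assumed: RSLT opens each n-best item
            nbest[-1][name] = value
        else:
            event[name] = value
    return event, nbest


# The sample event, joined onto one line and shortened (KEYS omitted).
line = (
    "TIME=20010816125814573|CHAN=1|EVNT=SWIrcnd|RSTT=ok|NBST=3|"
    "RSLT=????0130|RAWT=january thirtieth|SPOK=january thirtieth|"
    "GRMR=GURI0|CONF=982|RAWS=3150|"
    "RSLT=????0131|RAWT=january thirty_first|SPOK=january thirty_first|"
    "GRMR=GURI0|CONF=72|RAWS=1856|MDVR=1.7.0.0|LA=idle"
)
event, nbest = parse_rcnd(line)
for item in nbest:
    print(item["RSLT"], item["GRMR"], item["CONF"])
```

With this input, the loop prints one line per logged item: the first shows GURI0 with CONF 982 and the second shows GURI0 with CONF 72, while NBST remains 3 in the event-level tokens.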

Reasons for end of speech

Reasons for the end of speech (ENDR) include:

Return code	Status
ctimeout	The end of speech was detected (completetimeout was triggered).
eeos	External end of speech. The audio sample sent to the recognizer was labeled as the last sample.
itimeout	Normal end of speech.
maxs	The maximum speech time was reached (maxspeechtimeout).
nobos	No beginning of speech detected.
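The ENDR reason can be read together with the EOSD and EOSS tokens, since the token table notes that EOSD equal to EOSS points to an unusual end of speech (for example, the maximum speech duration timer expiring). The sketch below is one possible check, not a documented procedure: `endpointer_summary` is a hypothetical helper, it takes a dictionary of already-parsed token strings, and it assumes EOSD and EOSS are expressed in the same units so that their difference is meaningful.

```python
from typing import Dict, Optional, Union


def endpointer_summary(tokens: Dict[str, str]) -> Dict[str, Union[int, bool, Optional[str]]]:
    """Summarize how the end of speech was decided for one SWIrcnd event."""
    eoss = int(tokens["EOSS"])  # where the endpointer wanted recognition to stop
    eosd = int(tokens["EOSD"])  # data seen by the endpointer before deciding
    return {
        "endr": tokens.get("ENDR"),     # None if the token was not logged
        "decision_lag": eosd - eoss,    # extra data consumed to reach the decision
        "unusual_end": eosd == eoss,    # per the EOSD description in the token table
    }


# Values from the sample event (no ENDR token is shown there).
sample = endpointer_summary({"EOSS": "1624", "EOSD": "1624"})
print(sample["unusual_end"])  # True

# Hypothetical values for an utterance ended by a timeout.
timeout = endpointer_summary({"EOSS": "1200", "EOSD": "1900", "ENDR": "ctimeout"})
print(timeout["decision_lag"])  # 700
```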

Reasons for end of recognition

Reasons for the end of recognition (RENR) include:

Return code	Status
count	The maximum sentences were reached. (The max is determined by internal algorithms; this is not swirec_max_sentences.)
err	A system error occurred.
maxc	The maximum CPU time was reached.
maxsrch	The recognizer's maximum allowed search time was reached.
maxsent	The maximum number of sentences tried was reached.
ok	Recognition was successful. There is an n-best result.
prun	Stopped generating the n-best list. This can occur even if no n-best entries were returned. One cause is that the pruning threshold was exceeded (swirec_state_beam). Typically, though, it simply means that there were no more hypotheses to consider. For example, this happens when an n-best size of n is requested but the grammar has fewer than n choices. It also happens when the recognizer finds such a compelling acoustic match that all other hypotheses are pruned in the first-pass search.
stop	Recognizer received a stop request.

Return codes

Return codes (RSTT) are as follows:

Return code	Status
serr	A system error occurred.
lowconf	There was an n-best result (including any possible decoys), but it was below the setting of the confidencelevel parameter.
maxc	The maximum CPU time was reached (swirec_max_cpu_time).
nomatch	There was no recognition match, and no n-best result.
ok	Recognition was successful. There is an n-best result.
stop	Recognizer received a stop request.
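When post-processing logs, the RSTT value determines whether the n-best tokens are meaningful, since NBST is used only when RSTT is "ok" or "lowconf". The following sketch shows one way to encode that rule. `RSTT_STATUS` and `nbest_count` are hypothetical names, the status text is abbreviated from the table above, and treating every other return code as zero items is an assumption of this example.

```python
from typing import Dict

# Return codes and abbreviated status text from the table above.
RSTT_STATUS = {
    "serr": "system error",
    "lowconf": "n-best result below the confidencelevel setting",
    "maxc": "maximum CPU time reached",
    "nomatch": "no recognition match, no n-best result",
    "ok": "recognition successful, n-best result available",
    "stop": "recognizer received a stop request",
}


def nbest_count(tokens: Dict[str, str]) -> int:
    """Return the number of n-best items, or 0 when NBST does not apply."""
    rstt = tokens.get("RSTT")
    if rstt not in RSTT_STATUS:
        raise ValueError(f"unknown RSTT value: {rstt!r}")
    if rstt in ("ok", "lowconf"):
        return int(tokens.get("NBST", "0"))
    return 0  # assumed: no n-best items for the other return codes


print(nbest_count({"RSTT": "ok", "NBST": "3"}))  # 3
print(nbest_count({"RSTT": "nomatch"}))          # 0
```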
