Meeting Materials

From speechsc-bounces@ietf.org Sun Nov 06 14:33:30 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EYqGg-0000C9-5q; Sun, 06 Nov 2005 14:33:30 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EYqGe-0000Bf-1h for speechsc@megatron.ietf.org; Sun, 06 Nov 2005 14:33:28 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA16370 for ; Sun, 6 Nov 2005 14:33:02 -0500 (EST) Received: from mxgate1.brooktrout.com ([204.176.74.10]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EYqW6-0002ny-NM for speechsc@ietf.org; Sun, 06 Nov 2005 14:49:26 -0500 X-IronPort-AV: i="3.97,298,1125892800"; d="scan'208,217"; a="22342946:sNHT56851728" X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Date: Sun, 6 Nov 2005 14:32:47 -0500 Message-ID: <330A23D8336C0346B5C1A5BB196666476210EF@ATLANTIS.Brooktrout.com> Thread-Topic: Meeting Materials Thread-Index: AcXjCN5J7tNMGvuRR1aFdSzy7oUZRQ== From: "Burger, Eric" To: "IETF SPEECHSC" X-Spam-Score: 0.8 (/) X-Scan-Signature: 4adaf050708fb13be3316a9eee889caa Subject: [Speechsc] Meeting Materials X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Content-Type: multipart/mixed; boundary="===============0030645632==" Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org This is a multi-part message in MIME format. --===============0030645632== Content-class: urn:content-classes:message Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C5E308.DE499E1E" This is a multi-part message in MIME format. ------_=_NextPart_001_01C5E308.DE499E1E Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable If you haven't been to the supplemental site for a while, the meeting = materials for IETF 64 are now at: -- Sent from my Palm Treo - Sorry if Terse ------_=_NextPart_001_01C5E308.DE499E1E Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Meeting Materials

If you haven't been to the supplemental site for a = while, the meeting materials for IETF 64 are now at:

<https://datatracker.ietf.org/public/meeting_materials.cgi?meet= ing_num=3D64>

--
Sent from my Palm Treo - Sorry if Terse

------_=_NextPart_001_01C5E308.DE499E1E-- --===============0030645632== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc --===============0030645632==-- From speechsc-bounces@ietf.org Mon Nov 07 08:46:45 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZ7Kf-0005T7-C7; Mon, 07 Nov 2005 08:46:45 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZ7Ke-0005T2-2u for speechsc@megatron.ietf.org; Mon, 07 Nov 2005 08:46:44 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id IAA10149 for ; Mon, 7 Nov 2005 08:46:17 -0500 (EST) Received: from mmail.voxeo.com ([66.193.54.209] helo=voxeo.com) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EZ7aF-0003xA-BW for speechsc@ietf.org; Mon, 07 Nov 2005 09:02:52 -0500 Received: from [66.193.54.239] (account rj HELO [172.16.240.39]) by voxeo.com (CommuniGate Pro SMTP 4.2.10) with ESMTP-TLS id 4844165 for speechsc@ietf.org; Mon, 07 Nov 2005 08:46:23 -0500 Mime-Version: 1.0 (Apple Message framework v746.2) Content-Transfer-Encoding: 7bit Message-Id: Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed To: speechsc@ietf.org From: RJ Auburn Date: Mon, 7 Nov 2005 08:46:19 -0500 X-Mailer: Apple Mail (2.746.2) X-Spam-Score: 0.0 (/) X-Scan-Signature: 7d33c50f3756db14428398e2bdedd581 Content-Transfer-Encoding: 7bit Subject: [Speechsc] Vendor specific params X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org Folks, Maybe I missed something but is there a reason that the vendor specific params are only allowed on get/set-params and not on other requests such as recognize? Seems to be a bit of an oversight that we only allow the vendor headers on get/set requests when we allow so many other config prams on the other requests. Thoughts? RJ --- RJ Auburn CTO, Voxeo Corporation tel:+1-407-418-1800 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Mon Nov 07 09:33:55 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZ84J-0005QF-LU; Mon, 07 Nov 2005 09:33:55 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZ84I-0005Pl-4S for speechsc@megatron.ietf.org; Mon, 07 Nov 2005 09:33:54 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id JAA12949 for ; Mon, 7 Nov 2005 09:33:21 -0500 (EST) Received: from mmail.voxeo.com ([66.193.54.209] helo=voxeo.com) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EZ8Jl-0005JO-H3 for speechsc@ietf.org; Mon, 07 Nov 2005 09:49:56 -0500 Received: from [66.193.54.239] (account rj HELO [172.16.240.39]) by voxeo.com (CommuniGate Pro SMTP 4.2.10) with ESMTP-TLS id 4844282; Mon, 07 Nov 2005 09:33:18 -0500 In-Reply-To: <7DE7C4EF3B7C8B4B82955191378290D8036BCD83@mtb1exch01.nuance.com> References: <7DE7C4EF3B7C8B4B82955191378290D8036BCD83@mtb1exch01.nuance.com> Mime-Version: 1.0 (Apple Message framework v746.2) Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Message-Id: <2DDD23EC-CA69-44E6-95C5-DF5F7FF39EEC@voxeo.com> Content-Transfer-Encoding: 7bit From: RJ Auburn Subject: Re: [Speechsc] Vendor specific params Date: Mon, 7 Nov 2005 09:33:18 -0500 To: "Pierre Forgues" X-Mailer: Apple Mail (2.746.2) X-Spam-Score: 0.0 (/) X-Scan-Signature: 52f7a77164458f8c7b36b66787c853da Content-Transfer-Encoding: 7bit Cc: speechsc@ietf.org X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org Pierre, Thanks for the response. The current draft of the spec says that it's only allowed on get/set-params so this is why I was asking. If everyone is in agreement we should update the spec to allow it on the other methods. Thanks, RJ --- RJ Auburn CTO, Voxeo Corporation tel:+1-407-418-1800 On Nov 7, 2005, at 9:04 AM, Pierre Forgues wrote: > It should be possible to set vendor-specific parameters in > RECOGNIZE or > SPEAK operations as well as SET/GET operations. In my opinion, we > should clarify the MRCP specification regarding this. > > Pierre > > -----Original Message----- > From: speechsc-bounces@ietf.org [mailto:speechsc-bounces@ietf.org] On > Behalf Of RJ Auburn > Sent: Monday, November 07, 2005 8:46 AM > To: speechsc@ietf.org > Subject: [Speechsc] Vendor specific params > > Folks, > > Maybe I missed something but is there a reason that the vendor > specific params are only allowed on get/set-params and not on other > requests such as recognize? > > Seems to be a bit of an oversight that we only allow the vendor > headers on get/set requests when we allow so many other config prams > on the other requests. > > Thoughts? > > RJ > > --- > RJ Auburn > CTO, Voxeo Corporation > tel:+1-407-418-1800 > > > > > _______________________________________________ > Speechsc mailing list > Speechsc@ietf.org > https://www1.ietf.org/mailman/listinfo/speechsc > > > > _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Mon Nov 07 10:16:58 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZ8jy-0006ja-Dj; Mon, 07 Nov 2005 10:16:58 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZ7fS-0000vO-HO for speechsc@megatron.ietf.org; Mon, 07 Nov 2005 09:08:14 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id JAA11650 for ; Mon, 7 Nov 2005 09:07:48 -0500 (EST) Received: from letter.nuance.com ([207.107.210.132]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EZ7v4-0004eI-45 for speechsc@ietf.org; Mon, 07 Nov 2005 09:24:23 -0500 Received: from postcard.nuance.com ([10.3.6.20]:18689) by letter.nuance.com with esmtp id 1EZ7c7-0006Xk-Lv; Mon, 07 Nov 2005 06:04:47 -0800 Received: from mtb1exch01.nuance.com ([10.3.2.6]) by postcard.nuance.com with Microsoft SMTPSVC(6.0.3790.0); Mon, 7 Nov 2005 09:04:42 -0500 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Subject: RE: [Speechsc] Vendor specific params Date: Mon, 7 Nov 2005 09:04:06 -0500 Message-ID: <7DE7C4EF3B7C8B4B82955191378290D8036BCD83@mtb1exch01.nuance.com> Thread-Topic: [Speechsc] Vendor specific params Thread-Index: AcXjoe9JxHVtc4vRQICsbn7oVB2GzQAAgdRQ From: "Pierre Forgues" To: "RJ Auburn" , X-OriginalArrivalTime: 07 Nov 2005 14:04:42.0486 (UTC) FILETIME=[334EA160:01C5E3A4] X-FromHost: postcard.nuance.com [10.3.6.20]:18689 Lines: 43 X-Spam-Score: 0.0 (/) X-Scan-Signature: 9ed51c9d1356100bce94f1ae4ec616a9 Content-Transfer-Encoding: quoted-printable X-Mailman-Approved-At: Mon, 07 Nov 2005 10:16:57 -0500 Cc: X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org It should be possible to set vendor-specific parameters in RECOGNIZE or SPEAK operations as well as SET/GET operations. In my opinion, we should clarify the MRCP specification regarding this. Pierre -----Original Message----- From: speechsc-bounces@ietf.org [mailto:speechsc-bounces@ietf.org] On Behalf Of RJ Auburn Sent: Monday, November 07, 2005 8:46 AM To: speechsc@ietf.org Subject: [Speechsc] Vendor specific params Folks, Maybe I missed something but is there a reason that the vendor =20 specific params are only allowed on get/set-params and not on other =20 requests such as recognize? Seems to be a bit of an oversight that we only allow the vendor =20 headers on get/set requests when we allow so many other config prams =20 on the other requests. Thoughts? RJ --- RJ Auburn CTO, Voxeo Corporation tel:+1-407-418-1800 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc =20 =20 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Mon Nov 07 10:25:09 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZ8rt-0008ID-2X; Mon, 07 Nov 2005 10:25:09 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZ8rr-0008Gm-DE for speechsc@megatron.ietf.org; Mon, 07 Nov 2005 10:25:07 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA16259 for ; Mon, 7 Nov 2005 10:24:42 -0500 (EST) Received: from mxgate1.brooktrout.com ([204.176.74.10]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EZ97T-0006sP-J5 for speechsc@ietf.org; Mon, 07 Nov 2005 10:41:16 -0500 X-IronPort-AV: i="3.97,300,1125892800"; d="scan'208"; a="22377428:sNHT27882192" X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Date: Mon, 7 Nov 2005 10:24:58 -0500 Message-ID: <330A23D8336C0346B5C1A5BB196666470184FE17@ATLANTIS.Brooktrout.com> Thread-Topic: Session Streaming Thread-Index: AcXjOsPJCyhkEOCPRNeBx6Nlg3zXqQ== From: "Burger, Eric" To: X-Spam-Score: 0.0 (/) X-Scan-Signature: 2870a44b67ee17965ce5ad0177e150f4 Content-Transfer-Encoding: quoted-printable Subject: [Speechsc] Session Streaming X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org If you are not able to visit with us in Vancouver, the session will be streamed live. Check out http://videolab.uoregon.edu/events/ietf _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Mon Nov 07 12:17:44 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZAcq-0008Am-Jc; Mon, 07 Nov 2005 12:17:44 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZAcm-0008AT-GY for speechsc@megatron.ietf.org; Mon, 07 Nov 2005 12:17:42 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id MAA23469 for ; Mon, 7 Nov 2005 12:17:14 -0500 (EST) Received: from fw01.db01.voxpilot.com ([212.17.54.82] helo=mail.voxpilot.com) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EZAsO-0001q3-1l for speechsc@ietf.org; Mon, 07 Nov 2005 12:33:50 -0500 Received: from daburkewxp (unknown [66.153.100.53]) by mail.voxpilot.com (Postfix) with ESMTP id 80897214041; Mon, 7 Nov 2005 17:17:17 +0000 (GMT) Message-ID: <020101c5e3bf$1a590ef0$7f04a8c0@db01.voxpilot.com> From: "Dave Burke" To: "Andrew Wahbe" , "IETF SPEECHSC (E-mail)" References: <436190C3.2040908@voicegenie.com> <43619B00.3050401@voicegenie.com> <43619C00.4040209@voicegenie.com> Subject: Re: [speechsc] VoiceXML maxspeech timeout not implementable withMRCPv2 Date: Mon, 7 Nov 2005 17:17:14 -0000 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.2180 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 X-Spam-Score: 0.0 (/) X-Scan-Signature: 932cba6e0228cc603da43d861a7e09d8 Content-Transfer-Encoding: 7bit Cc: X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org I agree with Andrew's analysis and that his proposed text clarification is warranted, i.e. "003 recognition-timeout: RECOGNIZE completed without a match due to a recognition-timeout" Dave ----- Original Message ----- From: "Andrew Wahbe" To: "IETF SPEECHSC (E-mail)" Sent: Friday, October 28, 2005 3:33 AM Subject: Re: [speechsc] VoiceXML maxspeech timeout not implementable withMRCPv2 > Man I should stop writing emails so late at night: > My new proposed text for 003 wasn't changed at all. This is what I am > proposing: > > 003 recognition-timeout: RECOGNIZE completed without a match due to a > recognition-timeout > > Andrew > > Andrew Wahbe wrote: > >> I just noticed that my completion cause code analysis was wrong in one >> case: >> >> - between drafts 06 and 07 it seems the definition of Speech >> Incomplete Timeout (Section 9.4.16) was changed so that >> "partial-match" (013) was returned when it fired instead of "no-match" >> (001). >> >> So I guess what we are saying is that another way of terminating >> recognition is that the speech is just not matching the grammar at all >> and that is what yeilds no-match (001). (Though I could have sworn >> that a recognizer waits until you were done talking even if the >> utterance wasn't matching at all and that incomplete timeout was used >> in that case. Anyone care to clarify?) >> >> Anyways, I don't think this changes my real point below. >> >> Andrew >> >> Andrew Wahbe wrote: >> >>> Actually, this is something I've raised before with the 07 draft (see >>> http://www1.ietf.org/mail-archive/web/speechsc/current/msg01510.html) >>> but is still not fixed in 08. >>> >>> I don't see a way to implement the VoiceXML maxspeech timeout with >>> the current recognition completion cause codes. The semantics of the >>> Recognition Timer match those of VoiceXML maxspeech timeout. There >>> are currently 2 cause codes related to this timer: >>> >>> 003 recognition-timeout: RECOGNIZE in hotword mode completed without >>> a match due to a recognition-timeout >>> >>> 008 success-maxtime: RECOGNIZE request terminated because speech was >>> too long but whatever was spoken till that point was a full match. >>> >>> But what should be thrown in a non-hotword recognition when the timer >>> fires and there is no match? It can't be 001 no-match; the VoiceXML >>> interpreter will not be able to distinguish a "nomatch" from >>> "maxspeech" in this case. >>> >>> Let's take a step back. If we really want to describe what happened, >>> then I think we need to communicate to the client 1) why the >>> recognition was stopped and 2) if the result a no-match, a partial >>> match, or a complete match when it stopped. >>> >>> 1) The recognition can stop for the following reasons (the related >>> cause codes in parens): >>> >>> no-input timeout (002) >>> complete timeout (000) >>> incomplete timeout (001 013) >>> recognition timeout (003 and 008) >>> speech too early (007) >>> cancelled (011) >>> various errors (004, 005, 006, 009, 010, 012) >>> >>> 2) The result is irrelevant for no-input, speech too early, >>> cancelled, and the errors. This leaves complete timeout, incomplete >>> timeout, and recognition timeout. By the definition of complete >>> timeout, this is always a complete match (000). Also by definition, >>> incomplete timeout could result in a no match (001) or a partial >>> match (013) -- though VoiceXML will likely throw nomatch for either. >>> >>> So we are left with recognition timeout. It could be a complete match >>> (008 by the definition in section 9.4.11) or a partial match (008 by >>> the definition in section 9.4.7), or a no-match... well if it was a >>> hotword recognition we have 003; for the normal case, we have nothing. >>> >>> So hopefully I have clarified my earlier point that Recognition >>> Complete Cause codes 003 and 008 need revisiting. Sorry to be so long >>> winded but I think it's important that VoiceXML be implementable >>> using MRCPv2; it is a very common use case for the specification. >>> >>> Before proposing a solution, I would like to again ask why we need to >>> have a special cause code for recognition-timeout in the hotword >>> case. The answer to this question is here: >>> http://www1.ietf.org/mail-archive/web/speechsc/current/msg01501.html >>> >>> But I still don't see why you need a special completion cause to say >>> that the timer expired in hotword mode. The client knows what mode >>> the recognition is in. You just need to tell it that the timer fired. >>> >>> What the client does need to know is if there is a valid result for >>> it to process. Thus I propose that 003 be rewritten as follows: >>> >>> 003 recognition-timeout: RECOGNIZE in hotword mode completed without >>> a match due to a recognition-timeout >>> >>> This will bring it back in line with the earlier Scansoft & Nuance >>> proposal (well now it just a Nuance proposal) from a year and a half >>> ago: >>> http://www1.ietf.org/mail-archive/web/speechsc/current/msg00560.html >>> >>> Andrew >>> >>> _______________________________________________ >>> Speechsc mailing list >>> Speechsc@ietf.org >>> https://www1.ietf.org/mailman/listinfo/speechsc >>> >>> > -------------------------------------------------------------------------------- > _______________________________________________ > Speechsc mailing list > Speechsc@ietf.org > https://www1.ietf.org/mailman/listinfo/speechsc > _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Mon Nov 07 13:54:22 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZC8M-0006v0-1F; Mon, 07 Nov 2005 13:54:22 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZC8K-0006uv-I0 for speechsc@megatron.ietf.org; Mon, 07 Nov 2005 13:54:20 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id NAA29219 for ; Mon, 7 Nov 2005 13:53:54 -0500 (EST) Received: from sj-iport-4.cisco.com ([171.68.10.86]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EZCNy-0004ct-Op for speechsc@ietf.org; Mon, 07 Nov 2005 14:10:31 -0500 Received: from sj-core-3.cisco.com ([171.68.223.137]) by sj-iport-4.cisco.com with ESMTP; 07 Nov 2005 10:54:10 -0800 Received: from vtg-um-e2k6.sj21ad.cisco.com (vtg-um-e2k6.cisco.com [171.70.93.77]) by sj-core-3.cisco.com (8.12.10/8.12.6) with ESMTP id jA7Is4QQ001785; Mon, 7 Nov 2005 10:54:04 -0800 (PST) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Subject: RE: [speechsc] VoiceXML maxspeech timeout not implementable withMRCPv2 Date: Mon, 7 Nov 2005 10:54:06 -0800 Message-ID: <03772D1EC8DE624A863058C75874A75C739ECE@vtg-um-e2k6.sj21ad.cisco.com> Thread-Topic: [speechsc] VoiceXML maxspeech timeout not implementable withMRCPv2 Thread-Index: AcXjv+z4t3Gy6OPdT+SBCfRZty8yUgADA4mA From: "Shanmugham, Saravanan" To: "Dave Burke" , "Andrew Wahbe" , "IETF SPEECHSC $E-mail$" X-Spam-Score: 0.0 (/) X-Scan-Signature: f884eb1d4ec5a230688d7edc526ea665 Content-Transfer-Encoding: quoted-printable Cc: X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org I understand and see his point. I think what he is proposing is what it used to be before it was changed based on previous feed back. Hence I want to be carefull before making this change.=20 I have this as the only Open Issue in my slides for the SpeechSC meeting. If there is concerns raised this is a simple enough change to get into -09. Sarvi=20 =20 -----Original Message----- From: speechsc-bounces@ietf.org=20 [mailto:speechsc-bounces@ietf.org] On Behalf Of Dave Burke Sent: Monday, November 07, 2005 9:17 AM To: Andrew Wahbe; IETF SPEECHSC (E-mail) Subject: Re: [speechsc] VoiceXML maxspeech timeout not=20 implementable withMRCPv2 =20 I agree with Andrew's analysis and that his proposed text=20 clarification is warranted, i.e. =20 "003 recognition-timeout: RECOGNIZE completed without a=20 match due to a recognition-timeout" =20 Dave =20 ----- Original Message ----- From: "Andrew Wahbe" To: "IETF SPEECHSC (E-mail)" Sent: Friday, October 28, 2005 3:33 AM Subject: Re: [speechsc] VoiceXML maxspeech timeout not=20 implementable withMRCPv2 =20 =20 > Man I should stop writing emails so late at night: > My new proposed text for 003 wasn't changed at all. This=20 is what I am > proposing: > > 003 recognition-timeout: RECOGNIZE completed without a=20 match due to a=20 > recognition-timeout > > Andrew > > Andrew Wahbe wrote: > >> I just noticed that my completion cause code analysis=20 was wrong in=20 >> one >> case: >> >> - between drafts 06 and 07 it seems the definition of Speech=20 >> Incomplete Timeout (Section 9.4.16) was changed so that=20 >> "partial-match" (013) was returned when it fired=20 instead of "no-match" >> (001). >> >> So I guess what we are saying is that another way of=20 terminating=20 >> recognition is that the speech is just not matching the=20 grammar at=20 >> all and that is what yeilds no-match (001). (Though I=20 could have=20 >> sworn that a recognizer waits until you were done=20 talking even if the=20 >> utterance wasn't matching at all and that incomplete=20 timeout was used=20 >> in that case. Anyone care to clarify?) >> >> Anyways, I don't think this changes my real point below. >> >> Andrew >> >> Andrew Wahbe wrote: >> >>> Actually, this is something I've raised before with=20 the 07 draft=20 >>> (see >>>=20 http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 1510.html >>> ) >>> but is still not fixed in 08. >>> >>> I don't see a way to implement the VoiceXML maxspeech=20 timeout with=20 >>> the current recognition completion cause codes. The=20 semantics of the=20 >>> Recognition Timer match those of VoiceXML maxspeech=20 timeout. There=20 >>> are currently 2 cause codes related to this timer: >>> >>> 003 recognition-timeout: RECOGNIZE in hotword mode=20 completed without=20 >>> a match due to a recognition-timeout >>> >>> 008 success-maxtime: RECOGNIZE request terminated=20 because speech was=20 >>> too long but whatever was spoken till that point was a=20 full match. >>> >>> But what should be thrown in a non-hotword recognition=20 when the=20 >>> timer fires and there is no match? It can't be 001=20 no-match; the=20 >>> VoiceXML interpreter will not be able to distinguish a=20 "nomatch"=20 >>> from "maxspeech" in this case. >>> >>> Let's take a step back. If we really want to describe=20 what happened,=20 >>> then I think we need to communicate to the client 1) why the=20 >>> recognition was stopped and 2) if the result a=20 no-match, a partial=20 >>> match, or a complete match when it stopped. >>> >>> 1) The recognition can stop for the following reasons=20 (the related=20 >>> cause codes in parens): >>> >>> no-input timeout (002) >>> complete timeout (000) >>> incomplete timeout (001 013) >>> recognition timeout (003 and 008) >>> speech too early (007) >>> cancelled (011) >>> various errors (004, 005, 006, 009, 010, 012) >>> >>> 2) The result is irrelevant for no-input, speech too early,=20 >>> cancelled, and the errors. This leaves complete=20 timeout, incomplete=20 >>> timeout, and recognition timeout. By the definition of=20 complete=20 >>> timeout, this is always a complete match (000). Also=20 by definition,=20 >>> incomplete timeout could result in a no match (001) or=20 a partial=20 >>> match (013) -- though VoiceXML will likely throw=20 nomatch for either. >>> >>> So we are left with recognition timeout. It could be a=20 complete=20 >>> match >>> (008 by the definition in section 9.4.11) or a partial=20 match (008 by=20 >>> the definition in section 9.4.7), or a no-match... =20 well if it was a=20 >>> hotword recognition we have 003; for the normal case,=20 we have nothing. >>> >>> So hopefully I have clarified my earlier point that=20 Recognition=20 >>> Complete Cause codes 003 and 008 need revisiting.=20 Sorry to be so=20 >>> long winded but I think it's important that VoiceXML be=20 >>> implementable using MRCPv2; it is a very common use=20 case for the specification. >>> >>> Before proposing a solution, I would like to again ask=20 why we need=20 >>> to have a special cause code for recognition-timeout=20 in the hotword=20 >>> case. The answer to this question is here: >>>=20 http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 1501.html >>> >>> But I still don't see why you need a special=20 completion cause to say=20 >>> that the timer expired in hotword mode. The client=20 knows what mode=20 >>> the recognition is in. You just need to tell it that=20 the timer fired. >>> >>> What the client does need to know is if there is a=20 valid result for=20 >>> it to process. Thus I propose that 003 be rewritten as follows: >>> >>> 003 recognition-timeout: RECOGNIZE in hotword mode=20 completed without=20 >>> a match due to a recognition-timeout >>> >>> This will bring it back in line with the earlier=20 Scansoft & Nuance=20 >>> proposal (well now it just a Nuance proposal) from a=20 year and a half >>> ago: >>>=20 http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 0560.html >>> >>> Andrew >>> >>> _______________________________________________ >>> Speechsc mailing list >>> Speechsc@ietf.org >>> https://www1.ietf.org/mailman/listinfo/speechsc >>> >>> > =20 =20 ----------------------------------------------------------- --------------------- =20 =20 > _______________________________________________ > Speechsc mailing list > Speechsc@ietf.org > https://www1.ietf.org/mailman/listinfo/speechsc >=20 =20 =20 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc =20 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Mon Nov 07 13:57:44 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZCBc-0007eS-M8; Mon, 07 Nov 2005 13:57:44 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZCBb-0007eI-14 for speechsc@megatron.ietf.org; Mon, 07 Nov 2005 13:57:43 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id NAA29615 for ; Mon, 7 Nov 2005 13:57:16 -0500 (EST) Received: from sj-iport-4.cisco.com ([171.68.10.86]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EZCRF-0004mb-0w for speechsc@ietf.org; Mon, 07 Nov 2005 14:13:54 -0500 Received: from sj-core-3.cisco.com ([171.68.223.137]) by sj-iport-4.cisco.com with ESMTP; 07 Nov 2005 10:57:32 -0800 Received: from vtg-um-e2k6.sj21ad.cisco.com (vtg-um-e2k6.cisco.com [171.70.93.77]) by sj-core-3.cisco.com (8.12.10/8.12.6) with ESMTP id jA7IvQQQ002337; Mon, 7 Nov 2005 10:57:27 -0800 (PST) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Subject: RE: [Speechsc] Vendor specific params Date: Mon, 7 Nov 2005 10:57:29 -0800 Message-ID: <03772D1EC8DE624A863058C75874A75C739ED4@vtg-um-e2k6.sj21ad.cisco.com> Thread-Topic: [Speechsc] Vendor specific params Thread-Index: AcXjoe9JxHVtc4vRQICsbn7oVB2GzQAAgdRQAAouxTA= From: "Shanmugham, Saravanan" To: "Pierre Forgues" , "RJ Auburn" , X-Spam-Score: 0.0 (/) X-Scan-Signature: b4a0a5f5992e2a4954405484e7717d8c Content-Transfer-Encoding: quoted-printable Cc: X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org I don't see a problem with these headers being sent in all messages. Infact that was the intent of putting these headers in the generic section common to all resources.=20 I can make the change to remove the text that restricts it to SET-PARAMS/GET-PARAMS only. Sarvi=20 -----Original Message----- From: speechsc-bounces@ietf.org=20 [mailto:speechsc-bounces@ietf.org] On Behalf Of Pierre Forgues Sent: Monday, November 07, 2005 6:04 AM To: RJ Auburn; speechsc@ietf.org Subject: RE: [Speechsc] Vendor specific params =20 It should be possible to set vendor-specific parameters in=20 RECOGNIZE or SPEAK operations as well as SET/GET=20 operations. In my opinion, we should clarify the MRCP=20 specification regarding this. =20 Pierre =20 -----Original Message----- From: speechsc-bounces@ietf.org=20 [mailto:speechsc-bounces@ietf.org] On Behalf Of RJ Auburn Sent: Monday, November 07, 2005 8:46 AM To: speechsc@ietf.org Subject: [Speechsc] Vendor specific params =20 Folks, =20 Maybe I missed something but is there a reason that the=20 vendor specific params are only allowed on get/set-params=20 and not on other requests such as recognize? =20 Seems to be a bit of an oversight that we only allow the=20 vendor headers on get/set requests when we allow so many=20 other config prams on the other requests. =20 Thoughts? =20 RJ =20 --- RJ Auburn CTO, Voxeo Corporation tel:+1-407-418-1800 =20 =20 =20 =20 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc =20 =20 =20 =20 =20 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc =20 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Tue Nov 08 18:49:18 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZdDK-00075C-50; Tue, 08 Nov 2005 18:49:18 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZdDI-000754-Kl for speechsc@megatron.ietf.org; Tue, 08 Nov 2005 18:49:16 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id SAA14172 for ; Tue, 8 Nov 2005 18:48:49 -0500 (EST) Received: from mail.vocalocity.com ([38.116.10.185] helo=smtp.vocalocity.com) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EZdTB-0004vs-KD for speechsc@ietf.org; Tue, 08 Nov 2005 19:05:42 -0500 Received: by smtp.vocalocity.com (Postfix, from userid 9999) id 888B047455; Tue, 8 Nov 2005 18:49:06 -0500 (EST) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Date: Tue, 8 Nov 2005 18:49:04 -0500 Message-ID: <92E86BBD06161E4299A56009516472EDAD484D@gates.vcorp.vocalocity.net> Thread-Topic: test Thread-Index: AcXkvwAmKFeHWR8QSKOeSyg/D7EV2g== From: "Dan Burnett" To: X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on sandlot X-Spam-Status: No, score=-5.8 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, HTML_90_100,HTML_MESSAGE autolearn=ham version=3.0.4 X-Spam-Score: 0.0 (/) X-Scan-Signature: bdc523f9a54890b8a30dd6fd53d5d024 Subject: [Speechsc] test X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Content-Type: multipart/mixed; boundary="===============1718187689==" Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org This is a multi-part message in MIME format. --===============1718187689== Content-class: urn:content-classes:message Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C5E4BE.FFCCE52E" This is a multi-part message in MIME format. ------_=_NextPart_001_01C5E4BE.FFCCE52E Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable This is a test. Please ignore. =20 ------_=_NextPart_001_01C5E4BE.FFCCE52E Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

This is a test. Please = ignore.

------_=_NextPart_001_01C5E4BE.FFCCE52E-- --===============1718187689== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc --===============1718187689==-- From speechsc-bounces@ietf.org Tue Nov 08 18:51:17 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZdFF-0007Q9-2L; Tue, 08 Nov 2005 18:51:17 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZdFD-0007PS-Gp for speechsc@megatron.ietf.org; Tue, 08 Nov 2005 18:51:15 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id SAA14321 for ; Tue, 8 Nov 2005 18:50:45 -0500 (EST) Received: from mail.vocalocity.net ([38.116.10.177] helo=smtp.vocalocity.net) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EZdV4-0004yc-CK for speechsc@ietf.org; Tue, 08 Nov 2005 19:07:38 -0500 Received: by smtp.vocalocity.net (Postfix, from userid 9999) id 32A6C16CDE3; Tue, 8 Nov 2005 18:51:03 -0500 (EST) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Date: Tue, 8 Nov 2005 18:51:00 -0500 Message-ID: <92E86BBD06161E4299A56009516472EDAD484E@gates.vcorp.vocalocity.net> Thread-Topic: recognition-timeout Thread-Index: AcXkv0Vf4IXrBzIsS5KcR8nvLoz8Sg== From: "Dan Burnett" To: X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on revelation.vcorp.vocalocity.net X-Spam-Status: No, score=-5.7 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.0.4 X-Spam-Score: 0.0 (/) X-Scan-Signature: a7d2e37451f7f22841e3b6f40c67db0f Content-Transfer-Encoding: quoted-printable Subject: [Speechsc] recognition-timeout X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org Group, Group, Here is a different analysis that attempts to lay out the timelines of the various events. It is essentially a poorman's call flow diagram based almost exclusively on the header descriptions in section 9.4. I think it helps clarify the impact of Andrew's suggested change. Please let me know if I have made a mistake somewhere in this analysis. Note that I didn't include the "007 speech-too-early" case because I believe that case is clear. Hotword recognition case ------------------------ Recognizer begins listening. When speech begins no event is generated, but the recognition-timeout timer is started. If this timer expires the recognition request completes with a status code of "003 recognition-timeout". If and when an end of speech candidate is detected, the speech-complete-timeout and speech-incomplete-timeout timers are started (if necessary, the start time is backed up to the time when the speech finished). If the speech-incomplete-timeout timer expires, then a) if the speech heard so far is a valid partial match of one or more active grammars but not a complete match of any active grammar, the server rejects (what does this mean in the context of hotword recognition?) the partial result with a Completion-Cause of "partial-match". b) if the speech heard so far is a complete match of an active grammar but there exist one or more other complete matches for which the speech heard so far is a valid partial match, then this timeout also applies (what does this mean? -- does it mean the same response as in a above?) c) otherwise, continue If the speech-complete-timeout timer expires, then a) if the speech heard so far is a complete match of an active grammar and is not a valid partial match of any other grammar, then ?? (this is the implied case of recognition completing successfully) b) if the speech heard so far does not match any active grammar (what does this mean? Does it mean no partial match or are partial matches permitted?), then ?? (this is the implied case of a nomatch result) c) otherwise, continue Regular recognition case ------------------------ Recognizer begins listening. If the start-input-timers parameter is a) true (either by default or set explicitly), then the no-input-timeout timer is started. b) false, then the no-input-timeout timer is started when (and only when) a START-INPUT-TIMERS request is received by the server from the client. If the no-input-timeout timer expires, then a RECOGNITION-COMPLETE event is sent to the client with a Completion-Status of "002 no-input-timeout". If and when speech is detected the following actions occur: 1) a START-OF-INPUT event is sent to the client 2) the recognition-timeout timer is started. If this timer expires, then a) if the spoken input so far was a full match (what does this mean?), then the recognition request completes with a status code of "008 success-maxtime". b) if the spoken input so far was not a full match, then ???? (Note: action 2 here is what would change with Andrew's proposal -- it would become 'the recognition request completes with a status code of "003 recognition-timeout"', the same as in the hotword case. This would remove the success-maxtime case.) If and when an end of speech candidate is detected, the speech-complete-timeout and speech-incomplete-timeout timers are started (if necessary, the start time is backed up to the time when the speech finished). If the speech-incomplete-timeout timer expires, then a) if the speech heard so far is a valid partial match of one or more active grammars but not a complete match of any active grammar, the server rejects (what does this mean?) the partial result with a Completion-Cause of "013 partial-match". b) if the speech heard so far is a complete match of an active grammar but there exist one or more other complete matches for which the speech heard so far is a valid partial match, then this timeout also applies (what does this mean? -- does it mean the same response as in a above?) c) otherwise, continue If the speech-complete-timeout timer expires, then a) if the speech heard so far is a complete match of an active grammar and is not a valid partial match of any other grammar, then ?? (this is the implied case of recognition completing successfully) b) if the speech heard so far does not match any active grammar (what does this mean? Does it mean no partial match or are partial matches permitted?), then ?? (this is the implied case of a nomatch result) c) otherwise, continue _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Tue Nov 08 19:18:55 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZdfz-0002rC-0U; Tue, 08 Nov 2005 19:18:55 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZdfx-0002qc-G8 for speechsc@megatron.ietf.org; Tue, 08 Nov 2005 19:18:53 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id TAA22995 for ; Tue, 8 Nov 2005 19:18:26 -0500 (EST) Received: from mail.vocalocity.net ([38.116.10.177] helo=smtp.vocalocity.net) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EZdvq-0008Ao-2Z for speechsc@ietf.org; Tue, 08 Nov 2005 19:35:19 -0500 Received: by smtp.vocalocity.net (Postfix, from userid 9999) id B6A5E16CDE3; Tue, 8 Nov 2005 19:18:42 -0500 (EST) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Date: Tue, 8 Nov 2005 19:18:41 -0500 Message-ID: <92E86BBD06161E4299A56009516472EDAD4853@gates.vcorp.vocalocity.net> Thread-Topic: other timer-related issues Thread-Index: AcXkwyOEYfvJs7aCSGysj+hQcHTPdg== From: "Dan Burnett" To: X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on revelation.vcorp.vocalocity.net X-Spam-Status: No, score=-5.7 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.0.4 X-Spam-Score: 0.0 (/) X-Scan-Signature: 02ec665d00de228c50c93ed6b5e4fc1a Content-Transfer-Encoding: quoted-printable Subject: [Speechsc] other timer-related issues X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org Group, I had several other thoughts/comments not directly related to the recognition-timeout issue. 1) The definitions of "complete match", "full match", and "partial match" should be clearer, eg., - Full match =3D Complete match =3D "a sequence of recognized tokens = that constitutes a valid complete utterance according to the grammar" - N-token prefix =3D "the first N tokens of a complete match" - Partial match =3D "any N-token prefix" In addition, to simplify the timer text we could add the notion of - Maximal match =3D "a complete match that is not a partial match of any other complete match" 2) As I was writing out the analysis in my earlier email (while re-reading the draft), it was not immediately apparent in the draft what the behavior was in all cases, hence the "??" in several places. In a number of these cases one can implicitly assume the behavior to result in a "000 success" or "001 no-match", but I couldn't find it explicitly stated. Did I miss this, or do we need to add specific text? 3) It was also not immediately apparent that all the ways of successfully navigating past the speech-complete-timeout and speech-incomplete-timeout timers are addressed. I don't believe there are any restrictions on the order in which these two timers can expire, so what about the following cases (where a certain situation exists when the first timer expires and another exists when the second expires)? SIT -> (then) SCT no partial match (only possibility is no partial match) -- handled complete (no other partial) (only possibility is complete) -- handled SCT -> (then) SIT partial match complete -- not handled non-maximal complete match nomatch -- not handled? non-maximal complete match complete -- not handled Thoughts, anyone? Interestingly, 9.4.16 says that the speech-incomplete-timeout is usually longer than the speech-complete-timeout, which would lead to the second group of cases here (SIT after SCT). Again, if I've missed something, I'm sure someone will graciously correct me :) -- dan _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Wed Nov 09 16:26:13 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZxSP-0001T9-9e; Wed, 09 Nov 2005 16:26:13 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EZxSN-0001SY-FO for speechsc@megatron.ietf.org; Wed, 09 Nov 2005 16:26:11 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA10687 for ; Wed, 9 Nov 2005 16:25:42 -0500 (EST) Received: from [195.222.227.20] (helo=gromit.ibp.de ident=Debian-exim) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EZxiR-0002UM-NI for speechsc@ietf.org; Wed, 09 Nov 2005 16:42:48 -0500 Received: from [195.222.227.22] (helo=[195.222.227.22]) by gromit.ibp.de with esmtp (Exim 4.50) id 1EZxAr-0003t0-M3; Wed, 09 Nov 2005 22:08:06 +0100 Message-ID: <4372694C.3030702@ibp.de> Date: Wed, 09 Nov 2005 22:25:32 +0100 From: Claudia Daboul User-Agent: Mozilla Thunderbird 0.8 (Windows/20040913) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Dan Burnett Subject: Re: [Speechsc] other timer-related issues References: <92E86BBD06161E4299A56009516472EDAD4853@gates.vcorp.vocalocity.net> In-Reply-To: <92E86BBD06161E4299A56009516472EDAD4853@gates.vcorp.vocalocity.net> X-Enigmail-Version: 0.86.1.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: 0.0 (/) X-Spam-Score: 0.0 (/) X-Scan-Signature: 2e8fc473f5174be667965460bd5288ba Content-Transfer-Encoding: 7bit Cc: speechsc@ietf.org X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org Maybe it's useful to look at some examples to clarify the meaning of the various recognizer timeouts and events. I will explain my understanding of the terms that where discussed lately on some examples and we can see if everybody agrees at least on these examples. Consider a grammar that allows ordering drinks from a menu. It would allow utterances like: "A beer, please" or "Two coffee, a beer and two coke and an apple juice." Consider the following sample utterances: 1) "Two coffee and a beer." Complete, but not maximal match. (I use the term "maximal match" introduced by Dan.) These utterances would result in "000 success" after "speech-incomplete-timeout" has expired. 2) "Two coffee and a beer, please". A maximal match, would result in "000 success" after "speech-complete-timeout" has expired (the reaction would usually be a bit faster than in case 1, since "speech-complete-timeout < speech-incomplete-timeout") 3) "Two coffee, a beer." A partial, but not complete match, would result in "013 partial-match" after "speech-incomplete-timeout" has expired. Note that some recognizers allow setting different timeouts for cases 1 and 3. It would be something like "speech-maybecomplete-timeout", such that: speech-complete-timeout < speech-maybecomplete-timeout < speech-incomplete-timeout This intermediate timeout would then apply in case 1. 4) "Two coffee, a squishy and a beer." A no-match, because the word "squishy" is not in the grammar. This would usually result in a "001 nomatch" immediately after the end of the word "squishy" is detected. The recognizer would not have to wait for the "speech-incomplete-timeout" to fire. But this case depends on the implementation. The above behaviour could only be achieved, when speech is processed by the recognizer at realtime. Some implementations separate endpointing and speech recognition. Then utterances are only processed by the recognizer after the endpointing process has determined a complete utterance. In that setup the nomatch-event would come only after the word "beer" plus the "speech-incomplete-timeout" and the "speech-complete-timeout" and "speech-incomplete-timeout" could not be distinguished. I'm still not completely sure about the behaviour, when the utterance is cut off by the "recognition-timeout". For example consider the case where an utterance is cut off in the middle of the last word like this: 5) "Two coffee, a beer, a coke and two tea and a glass of wine and an oran" What should be the return code in this case: "001 nomatch", "008 success-maxtime" or "013 partial-match"? Would it make a difference if the utterance was cut off a bit earlier as in 5') "Two coffee, a beer, a coke and two tea and a glass of wine and an" or 5'') "Two coffee, a beer, a coke and two tea and a glass of wine" ? I think it should be "008" in all these cases. Andrew's suggestion I interpreted as follows: The cut off utterance 6) "Two coffee, a squishy, a coke and two tea and a glass of wine and an" which contains the unknown word "squishy" should return "003" instead of the usual "001". Did I get this right, Andrew? Again this would only apply in the setup where endpointing must be completed before the actual recognition starts. An application could distinguish this case from the usual no-match and have different prompts like: A) "I didn't completey understand your order. Please make sure to choose only items from our menu." (for 001) B) "Your order is getting too long and I didn't understand everything you said. Let us go through your order item by item and please make sure to order only items from our menu. What was the first item again?" (for 003) I'm not sure, if it's really necessary to have this distinction, though. I think it will rarely be used by applications. Anyway I hope these examples can help a bit to clarify the terminology. Does anybody have examples for the timeouts in hotword-mode? Claudia Dan Burnett wrote: > Group, > > I had several other thoughts/comments not directly related to the > recognition-timeout issue. > > 1) The definitions of "complete match", "full match", and "partial > match" should be clearer, eg., > - Full match = Complete match = "a sequence of recognized tokens that > constitutes a > valid complete utterance according to the grammar" > - N-token prefix = "the first N tokens of a complete match" > - Partial match = "any N-token prefix" > > In addition, to simplify the timer text we could add the notion of > - Maximal match = "a complete match that is not a partial match of > any > other complete match" > > > 2) As I was writing out the analysis in my earlier email (while > re-reading the draft), it was not immediately > apparent in the draft what the behavior was in all cases, hence the > "??" in several places. In a number of these cases one can > implicitly > assume the behavior to result in a "000 success" or "001 no-match", > but I couldn't find it explicitly stated. Did I miss this, or do > we need to add specific text? > > 3) It was also not immediately apparent that all the ways of > successfully navigating past the speech-complete-timeout and > speech-incomplete-timeout timers are addressed. I don't believe > there are any restrictions on the order in which these two timers > can expire, so what about the following cases (where a certain > situation exists when the first timer expires and another exists > when the second expires)? > > SIT -> (then) SCT > no partial match (only possibility is no partial match) > -- handled > complete (no other partial) (only possibility is complete) -- > handled > > > SCT -> (then) SIT > partial match complete -- not handled > non-maximal complete match nomatch -- not handled? > non-maximal complete match complete -- not handled > > > Thoughts, anyone? > > Interestingly, 9.4.16 says that the speech-incomplete-timeout is > usually longer than the speech-complete-timeout, which would lead to > the second group of cases here (SIT after SCT). > > > Again, if I've missed something, I'm sure someone will graciously > correct me :) > > -- dan > > _______________________________________________ > Speechsc mailing list > Speechsc@ietf.org > https://www1.ietf.org/mailman/listinfo/speechsc _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Thu Nov 10 11:20:22 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaF9y-0002Aw-2O; Thu, 10 Nov 2005 11:20:22 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaF9w-0002Ar-Il for speechsc@megatron.ietf.org; Thu, 10 Nov 2005 11:20:20 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id LAA14166 for ; Thu, 10 Nov 2005 11:19:51 -0500 (EST) Received: from mxgate1.brooktrout.com ([204.176.74.10]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EaFQA-00073Q-QJ for speechsc@ietf.org; Thu, 10 Nov 2005 11:37:08 -0500 X-IronPort-AV: i="3.97,314,1125892800"; d="scan'208"; a="22538763:sNHT27255584" Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Date: Thu, 10 Nov 2005 11:19:01 -0500 Message-ID: <330A23D8336C0346B5C1A5BB19666647018F7C6F@ATLANTIS.Brooktrout.com> Thread-Topic: Meeting Info Thread-Index: AcXmEnYmz+hJTvEXSNyiIOXX4PG7fg== From: "Burger, Eric" To: X-Spam-Score: 0.0 (/) X-Scan-Signature: 68c8cc8a64a9d0402e43b8eee9fc4199 Content-Transfer-Encoding: quoted-printable Subject: [Speechsc] Meeting Info X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org The Materials are on the meeting web site: https://onsite.ietf.org/public/meeting_materials.cgi?meeting_num=3D64 Scroll down to speechsc. The agenda and slides are there. Audio (sometimes spotty) is at: http://videolab.uoregon.edu/events/ietf/ietf645.m3u Jabber: Server: ietf.xmpp.org Room: speechsc _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Thu Nov 10 11:27:53 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaFHE-0002tV-V8; Thu, 10 Nov 2005 11:27:52 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaFHD-0002tJ-Bx for speechsc@megatron.ietf.org; Thu, 10 Nov 2005 11:27:51 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id LAA14552 for ; Thu, 10 Nov 2005 11:27:22 -0500 (EST) Received: from web33109.mail.mud.yahoo.com ([68.142.206.90]) by ietf-mx.ietf.org with smtp (Exim 4.43) id 1EaFXS-0007DA-Di for speechsc@ietf.org; Thu, 10 Nov 2005 11:44:39 -0500 Received: (qmail 51411 invoked by uid 60001); 10 Nov 2005 16:27:41 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Message-ID:Received:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding; b=ipAsUoQ8OJVBERzso+CbAccum92v+BViFOOv13pl3YSL+ID+Z4aFQUXMDYsya5+tBM1OCgtTbHGLoyNsdogEPu44IUU6vZ+aXXsXahlkQ873iMa6KSyVuF604uFOGzwpYV5dHDKdaCn+wOzPL4Fnq6LlnjYKdcQevKvihpJ6+os= ; Message-ID: <20051110162741.51409.qmail@web33109.mail.mud.yahoo.com> Received: from [206.47.9.3] by web33109.mail.mud.yahoo.com via HTTP; Thu, 10 Nov 2005 08:27:41 PST Date: Thu, 10 Nov 2005 08:27:41 -0800 (PST) From: Dan Burnett Subject: Re: [Speechsc] other timer-related issues To: Dan Burnett , speechsc@ietf.org In-Reply-To: <92E86BBD06161E4299A56009516472EDAD4853@gates.vcorp.vocalocity.net> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Spam-Score: 3.5 (+++) X-Scan-Signature: 5ebbf074524e58e662bc8209a6235027 Content-Transfer-Encoding: 8bit Cc: X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org Someone pointed out to me offline that I left something out of the definition of "partial match". I had originally described the prefix as being a proper prefix (i.e. not including the entire complete match token string) but somehow dropped it. The definition of a partial match should be "any N-token prefix that is not itself a complete match". -- dan --- Dan Burnett wrote: > Group, > > I had several other thoughts/comments not directly > related to the > recognition-timeout issue. > > 1) The definitions of "complete match", "full > match", and "partial > match" should be clearer, eg., > - Full match = Complete match = "a sequence of > recognized tokens that > constitutes a > valid complete utterance according to the > grammar" > - N-token prefix = "the first N tokens of a > complete match" > - Partial match = "any N-token prefix" > > In addition, to simplify the timer text we could > add the notion of > - Maximal match = "a complete match that is not a > partial match of > any > other complete match" > > > 2) As I was writing out the analysis in my earlier > email (while > re-reading the draft), it was not immediately > apparent in the draft what the behavior was in > all cases, hence the > "??" in several places. In a number of these > cases one can > implicitly > assume the behavior to result in a "000 success" > or "001 no-match", > but I couldn't find it explicitly stated. Did I > miss this, or do > we need to add specific text? > > 3) It was also not immediately apparent that all the > ways of > successfully navigating past the > speech-complete-timeout and > speech-incomplete-timeout timers are addressed. > I don't believe > there are any restrictions on the order in which > these two timers > can expire, so what about the following cases > (where a certain > situation exists when the first timer expires and > another exists > when the second expires)? > > SIT -> (then) SCT > no partial match (only possibility is > no partial match) > -- handled > complete (no other partial) (only possibility is > complete) -- > handled > > > SCT -> (then) SIT > partial match complete -- not > handled > non-maximal complete match nomatch -- not > handled? > non-maximal complete match complete -- not > handled > > > Thoughts, anyone? > > Interestingly, 9.4.16 says that the > speech-incomplete-timeout is > usually longer than the speech-complete-timeout, > which would lead to > the second group of cases here (SIT after SCT). > > > Again, if I've missed something, I'm sure someone > will graciously > correct me :) > > -- dan > > _______________________________________________ > Speechsc mailing list > Speechsc@ietf.org > https://www1.ietf.org/mailman/listinfo/speechsc > __________________________________ Yahoo! Mail - PC Magazine Editors' Choice 2005 http://mail.yahoo.com _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Thu Nov 10 12:43:33 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaGST-0004c0-6c; Thu, 10 Nov 2005 12:43:33 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaGSR-0004at-2s for speechsc@megatron.ietf.org; Thu, 10 Nov 2005 12:43:31 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id MAA20989 for ; Thu, 10 Nov 2005 12:43:01 -0500 (EST) Received: from mail.voicegenie.com ([205.150.90.87] helo=voicegenie.com) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EaGig-0001bJ-E3 for speechsc@ietf.org; Thu, 10 Nov 2005 13:00:19 -0500 Received: from [205.150.90.65] (parrot.voicegenie.com [205.150.90.65]) by voicegenie.com (8.11.6+Sun/8.9.3) with ESMTP id jAAHgZR04524; Thu, 10 Nov 2005 12:42:35 -0500 (EST) Message-ID: <4373868B.1030309@voicegenie.com> Date: Thu, 10 Nov 2005 12:42:35 -0500 From: Andrew Wahbe Organization: VoiceGenie Technologies User-Agent: Mozilla Thunderbird 1.0.2 (Windows/20050317) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Claudia Daboul Subject: Re: [Speechsc] other timer-related issues References: <92E86BBD06161E4299A56009516472EDAD4853@gates.vcorp.vocalocity.net> <4372694C.3030702@ibp.de> In-Reply-To: <4372694C.3030702@ibp.de> Content-Type: multipart/mixed; boundary="------------060606030808020008060500" X-Spam-Score: 0.0 (/) X-Scan-Signature: f1405b5eaa25d745f8c52e3273d3af78 Cc: speechsc@ietf.org X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org This is a multi-part message in MIME format. --------------060606030808020008060500 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit I think I can clarify when I think 003 would be returned. The most obvious example I can think of is actually from experience -- the caller didn't hang up the phone properly and put it down next to their TV. We just happened to be recording the utterances for tuning. What we got was a series of maxspeech timeouts (in VoiceXML speak) where each utterance was just noise (that included bits of speech). This would clearly result in an MRCP Recognition completion cause of 003. For other cases I don't think its as clear cut (according to the current spec -- even with my proposed modification). However, in my opinion (after rethinking the issues a little), if the recognition-timeout fires either 003 or 008 is returned. 008 is only returned if the utterance is a complete match and, more importantly, there is a valid result to be processed. For me that is the key question: Is there, or is there not a valid result that the app can process? If there is, the result is 008; otherwise, it's 003. Determining where the timer fired in the utterance and if the user was cut off is not really important. Additionally, I don't think its that reliable. In reality, one frequently gets a low confidence match rather than a nomatch from the recognizer. If you aren't getting a clear distinction between matches and no-matches, throwing partial-matches into the mix doesn't seem worthwhile. Also, what are you going to do when you have a partial match? Can you process an attached result? I don't think so -- remember that the semantic interpretation rules need to run on a matched token sequence. It is quite likely that the logic in those rules assume full matches. If you run the rules on a partial match you run the risk of generating some weird semantic interpretation result that can't be processed. So I really question the usefulness of the partial-match result -- perhaps its worth changing this back to no-match (as it was in the 06 draft). I can't recall the reason for introducing partial-match either; anyone care to clarify? Anyways, in my opinion the recognition timeout examples you gave should, strictly speaking have the following completion cause codes: 5) 003 5') 003 5'') 008 6) 003 This is because 5'' is the only complete match. However, practically speaking you will probably get 008 for all versions of 5) since its quite likely that either "Two coffee, a beer, a coke and two tea and a glass of wine and an orange" or "Two coffee, a beer, a coke and two tea and a glass of wine" is matched with a reasonable confidence level. For 6) you might still get 003 as long as "squishy" wasn't mistaken for something else in the grammar. For your examples 1-4, I agree with your analysis (though I again question the usefulness of partial match -- I think that strictly speaking 3 should be a no-match, but practically speaking you would probably get a match for something like "Two coffee, and a beer" with perhaps a lower but reasonable confidence). Another comment: I don't think I've ever seen a recognizer return in the middle of an utterance because it hit a token that didn't match the grammar. ie. if you said "Two coffee, a squishy and a beer." and squishy wasn't in the grammar, the system would still wait until you finished speaking. This could just be because this token is matching something else in the grammar with a low confidence or the endpointing is taking place before hand as you mentioned. I'm not sure... and it probably depends on the recognizer you are using. I would find that behavior a bit strange though... wouldn't you want to always try and wait for the user to finish speaking? Otherwise, wouldn't you be sentencing yourself to a speech-too-early result after each no-match result? Anyways, its not a big deal -- I'm not proposing we change the definitions of no-match or the timeouts in this regard. I think it does reinforce my point that focusing on complete match vs partial match vs no match of the token string is perhaps the wrong approach when generating the completion-cause code. I think its better to discern: I have a result for you to process vs I don't have a valid result. IMO, this line of reasoning gives you no-input (no input at all) no-match (I got input but no valid result) match (I got input and a valid result) recognition-timeout (I got input, but it was too long and I didn't get a valid result) success-maxtime (I got input, and it was too long but what I did get seemed to be a valid result). (along with speech-too-early and all the error results). I think these are basically the codes in the 06 draft with the distinction between 003 and 008 (the original concern that precipitated the changes) clarified. This is just what makes sense to me, at least for normal recognition mode. I'd need to think about hotword recognition some more before commenting. Finally on your comment of the usefulness of discerning 001 from 003: I can understand your point of view, but the fact of the matter is that a VoiceXML browser requires that this distinction be made. I think it makes a lot of sense to cater to this need since VoiceXML browsers will probably one of the most (if not the most) common forms of MRCP v2 client. Andrew Claudia Daboul wrote: > Maybe it's useful to look at some examples to clarify the meaning of > the various recognizer timeouts and events. > > I will explain my understanding of the terms that where discussed > lately on some examples and we can see if everybody agrees at least on > these examples. > > Consider a grammar that allows ordering drinks from a menu. It would > allow utterances like: "A beer, please" or "Two coffee, a beer and two > coke and an apple juice." > > Consider the following sample utterances: > > 1) "Two coffee and a beer." > Complete, but not maximal match. (I use the term "maximal match" > introduced by Dan.) > These utterances would result in "000 success" after > "speech-incomplete-timeout" has expired. > > 2) "Two coffee and a beer, please". > A maximal match, would result in "000 success" after > "speech-complete-timeout" has expired > (the reaction would usually be a bit faster than in case 1, since > "speech-complete-timeout < speech-incomplete-timeout") > > 3) "Two coffee, a beer." > A partial, but not complete match, > would result in "013 partial-match" after "speech-incomplete-timeout" > has expired. > > Note that some recognizers allow setting different timeouts for cases > 1 and 3. It would be something like "speech-maybecomplete-timeout", > such that: > speech-complete-timeout < speech-maybecomplete-timeout < > speech-incomplete-timeout > This intermediate timeout would then apply in case 1. > > 4) "Two coffee, a squishy and a beer." > A no-match, because the word "squishy" is not in the grammar. This > would usually result in a "001 nomatch" immediately after the end of > the word "squishy" is detected. The recognizer would not have to wait > for the "speech-incomplete-timeout" to fire. > But this case depends on the implementation. The above behaviour could > only be achieved, when speech is processed by the recognizer at > realtime. Some implementations separate endpointing and speech > recognition. Then utterances are only processed by the recognizer > after the endpointing process has determined a complete utterance. In > that setup the nomatch-event would come only after the word "beer" > plus the "speech-incomplete-timeout" and the "speech-complete-timeout" > and "speech-incomplete-timeout" could not be distinguished. > > I'm still not completely sure about the behaviour, when the utterance > is cut off by the "recognition-timeout". For example consider the > case where an utterance is cut off in the middle of the last word like > this: > > 5) "Two coffee, a beer, a coke and two tea and a glass of wine and an > oran" > > What should be the return code in this case: "001 nomatch", "008 > success-maxtime" or "013 partial-match"? > > Would it make a difference if the utterance was cut off a bit earlier > as in > 5') "Two coffee, a beer, a coke and two tea and a glass of wine and an" > or > 5'') "Two coffee, a beer, a coke and two tea and a glass of wine" > ? > I think it should be "008" in all these cases. > > Andrew's suggestion I interpreted as follows: The cut off utterance > 6) "Two coffee, a squishy, a coke and two tea and a glass of wine and an" > which contains the unknown word "squishy" should return "003" instead > of the usual "001". Did I get this right, Andrew? > Again this would only apply in the setup where endpointing must be > completed before the actual recognition starts. > > An application could distinguish this case from the usual no-match and > have different prompts like: > A) "I didn't completey understand your order. Please make sure to > choose only items from our menu." (for 001) > B) "Your order is getting too long and I didn't understand everything > you said. Let us go through your order item by item and please make > sure to order only items from our menu. What was the first item > again?" (for 003) > I'm not sure, if it's really necessary to have this distinction, > though. I think it will rarely be used by applications. > > Anyway I hope these examples can help a bit to clarify the > terminology. Does anybody have examples for the timeouts in hotword-mode? > > Claudia > > > > > Dan Burnett wrote: > >> Group, >> >> I had several other thoughts/comments not directly related to the >> recognition-timeout issue. >> >> 1) The definitions of "complete match", "full match", and "partial >> match" should be clearer, eg., >> - Full match = Complete match = "a sequence of recognized tokens that >> constitutes a >> valid complete utterance according to the grammar" >> - N-token prefix = "the first N tokens of a complete match" >> - Partial match = "any N-token prefix" >> >> In addition, to simplify the timer text we could add the notion of >> - Maximal match = "a complete match that is not a partial match of >> any >> other complete match" >> >> >> 2) As I was writing out the analysis in my earlier email (while >> re-reading the draft), it was not immediately >> apparent in the draft what the behavior was in all cases, hence the >> "??" in several places. In a number of these cases one can >> implicitly >> assume the behavior to result in a "000 success" or "001 no-match", >> but I couldn't find it explicitly stated. Did I miss this, or do >> we need to add specific text? >> >> 3) It was also not immediately apparent that all the ways of >> successfully navigating past the speech-complete-timeout and >> speech-incomplete-timeout timers are addressed. I don't believe >> there are any restrictions on the order in which these two timers >> can expire, so what about the following cases (where a certain >> situation exists when the first timer expires and another exists >> when the second expires)? >> >> SIT -> (then) SCT >> no partial match (only possibility is no partial match) >> -- handled >> complete (no other partial) (only possibility is complete) -- >> handled >> >> >> SCT -> (then) SIT >> partial match complete -- not handled >> non-maximal complete match nomatch -- not handled? >> non-maximal complete match complete -- not handled >> >> >> Thoughts, anyone? >> >> Interestingly, 9.4.16 says that the speech-incomplete-timeout is >> usually longer than the speech-complete-timeout, which would lead to >> the second group of cases here (SIT after SCT). >> >> >> Again, if I've missed something, I'm sure someone will graciously >> correct me :) >> >> -- dan >> >> _______________________________________________ >> Speechsc mailing list >> Speechsc@ietf.org >> https://www1.ietf.org/mailman/listinfo/speechsc > > > > > _______________________________________________ > Speechsc mailing list > Speechsc@ietf.org > https://www1.ietf.org/mailman/listinfo/speechsc > > --------------060606030808020008060500 Content-Type: text/x-vcard; charset=utf-8; name="awahbe.vcf" Content-Disposition: attachment; filename="awahbe.vcf" Content-Transfer-Encoding: 7bit begin:vcard fn:Andrew Wahbe n:Wahbe;Andrew org:VoiceGenie Technologies INC.;Multimodal and Development Tools adr:8th Floor;;1120 Finch Avenue W.;Toronto;ON;M3J 3H7;Canada email;internet:awahbe@voicegenie.com title:Technical Manager tel;work:(416) 736-0905 ext. 258 tel;fax:(416) 736-1551 x-mozilla-html:TRUE url:http://www.voicegenie.com version:2.1 end:vcard --------------060606030808020008060500 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc --------------060606030808020008060500-- From speechsc-bounces@ietf.org Thu Nov 10 14:01:03 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaHfT-0002U6-48; Thu, 10 Nov 2005 14:01:03 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaHfR-0002TG-HA for speechsc@megatron.ietf.org; Thu, 10 Nov 2005 14:01:01 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA26053 for ; Thu, 10 Nov 2005 14:00:33 -0500 (EST) Received: from web33104.mail.mud.yahoo.com ([68.142.206.85]) by ietf-mx.ietf.org with smtp (Exim 4.43) id 1EaHvh-0003ye-6X for speechsc@ietf.org; Thu, 10 Nov 2005 14:17:50 -0500 Received: (qmail 28286 invoked by uid 60001); 10 Nov 2005 19:00:51 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Message-ID:Received:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding; b=1ZucxW7/Sqz4b3Mma2lNd9Ki1MtbpMCjMWeJMaQ4pLywDqAl1TqUNf0hJMJoF02B4ivavYhBDJkAi/HfvMWIZx7k+2INT3s/q8sosKNndOR3V6+1OUjEdCR3AVcfxTAB3MoUOQOMpEV/SGjocuk166I5DZjs5vtj+5QaOytdXe4= ; Message-ID: <20051110190051.28284.qmail@web33104.mail.mud.yahoo.com> Received: from [209.52.108.206] by web33104.mail.mud.yahoo.com via HTTP; Thu, 10 Nov 2005 11:00:50 PST Date: Thu, 10 Nov 2005 11:00:50 -0800 (PST) From: Dan Burnett Subject: Re: [Speechsc] other timer-related issues To: speechsc@ietf.org In-Reply-To: <4373868B.1030309@voicegenie.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Spam-Score: 3.6 (+++) X-Scan-Signature: 10ba05e7e8a9aa6adb025f426bef3a30 Content-Transfer-Encoding: 8bit X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org I agree with Andrew's observations, statements and analysis below, and I believe the conditions (and their definitions) are what was originally intended before the concept of partial matching entered the document. I would prefer to have the return conditions as you've specified them. I believe they would work as follows for hotword: no-input (no input at all) match (I got input and a valid result) success-maxtime (I got input, and it was too long but what I did get seemed to be a valid result). The no-match and recognition-timeout cases don't matter for hotword (at least for VoiceXML). -- dan --- Andrew Wahbe wrote: ... > Another comment: I don't think I've ever seen a > recognizer return in the > middle of an utterance because it hit a token that > didn't match the > grammar. ie. if you said "Two coffee, a squishy and > a beer." and squishy > wasn't in the grammar, the system would still wait > until you finished > speaking. This could just be because this token is > matching something > else in the grammar with a low confidence or the > endpointing is taking > place before hand as you mentioned. I'm not sure... > and it probably > depends on the recognizer you are using. > > I would find that behavior a bit strange though... > wouldn't you want to > always try and wait for the user to finish speaking? > Otherwise, wouldn't > you be sentencing yourself to a speech-too-early > result after each > no-match result? Anyways, its not a big deal -- I'm > not proposing we > change the definitions of no-match or the timeouts > in this regard. I > think it does reinforce my point that focusing on > complete match vs > partial match vs no match of the token string is > perhaps the wrong > approach when generating the completion-cause code. > I think its better > to discern: I have a result for you to process vs I > don't have a valid > result. > > IMO, this line of reasoning gives you > no-input (no input at all) > no-match (I got input but no valid result) > match (I got input and a valid result) > recognition-timeout (I got input, but it was too > long and I didn't get a > valid result) > success-maxtime (I got input, and it was too long > but what I did get > seemed to be a valid result). > (along with speech-too-early and all the error > results). > > I think these are basically the codes in the 06 > draft with the > distinction between 003 and 008 (the original > concern that precipitated > the changes) clarified. This is just what makes > sense to me, at least > for normal recognition mode. I'd need to think about > hotword recognition > some more before commenting. ... __________________________________ Yahoo! Mail - PC Magazine Editors' Choice 2005 http://mail.yahoo.com _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Thu Nov 10 15:51:12 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaJO4-0007NG-AU; Thu, 10 Nov 2005 15:51:12 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaJO1-0007N3-Po for speechsc@megatron.ietf.org; Thu, 10 Nov 2005 15:51:11 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA03734 for ; Thu, 10 Nov 2005 15:50:41 -0500 (EST) Received: from sj-iport-4.cisco.com ([171.68.10.86]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EaJeJ-0008Fs-2L for speechsc@ietf.org; Thu, 10 Nov 2005 16:07:59 -0500 Received: from sj-core-4.cisco.com ([171.68.223.138]) by sj-iport-4.cisco.com with ESMTP; 10 Nov 2005 12:50:59 -0800 Received: from vtg-um-e2k6.sj21ad.cisco.com (vtg-um-e2k6.cisco.com [171.70.93.77]) by sj-core-4.cisco.com (8.12.10/8.12.6) with ESMTP id jAAKovag022156; Thu, 10 Nov 2005 12:50:57 -0800 (PST) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Subject: RE: [speechsc] VoiceXML maxspeech timeout not implementable withMRCPv2 Date: Thu, 10 Nov 2005 12:50:55 -0800 Message-ID: <03772D1EC8DE624A863058C75874A75C73A327@vtg-um-e2k6.sj21ad.cisco.com> Thread-Topic: [speechsc] VoiceXML maxspeech timeout not implementable withMRCPv2 Thread-Index: AcXjv+z4t3Gy6OPdT+SBCfRZty8yUgADA4mAAJoDe9A= From: "Shanmugham, Saravanan" To: "Shanmugham, Saravanan" , "Dave Burke" , "Andrew Wahbe" , "IETF SPEECHSC $E-mail$" X-Spam-Score: 0.0 (/) X-Scan-Signature: 918f4bd8440e8de4700bcf6d658bc801 Content-Transfer-Encoding: quoted-printable Cc: X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org I just re-read this whole description and came to a different conlcusion. When a recongizer is started and the speaker starts speaking. The match conditions for the spoken speech and grammar matching upto that point can be one of the following no-match partial-match complete-match You could go one step further and say a complete-match can be of 2 sub-classes on further grammar possibilities(Is there value to doing this. My thinking is not, Any thoughts) complete-match-with-more-possibilities Complete-match-with-no-more-possibilities Theoretically speaking, no-match could be decided the moment that the speaker speaks something that does not match anything, which could be even the first word. But, if I understand right, most commercial recognizers today don't match the speech as they are spoken, but rather wait for end-pointing/some-silience or a timer before doing the matching. NOW, walking through what happens during a non-hotword RECOGNIZE operation. The Recognition-Timer is started right at the beginning of the RECOGNIZE operation. When such a silence happens they take the speech collected upto that point, and match it with the grammar and do the following. 1. If there was NO match - they generate "no-match 001" completion cause code. 2. If there was partial-match - They start the Speech-Incomplete-Timer 3. If there was complete-match(both sub types) - They start the Speech-Complete-Timer. Now lets look at the cases of the different timers expiring. Speech-Incomplete-Timer expires, we return partial-match 013 When Speech-Complete-Timer expires, we return "success" 000. When Recognition-Timer expires we have multiple possibilities based on the grammar match condition. 1. no-match 001 2. partial-match 013 3. succcess (which is differentiated as success-maxtime 008) So I don't see what we are missing here. Now theoretically, we could create new completion cause code no-match-max-time, and partial-match-maxtime to differentiate them from the ones that would have been generated during a silence/endpointing. Now If I understand Andrew, he is refering to no-match-max-time as Recognition-Timeout, which is fine by me, but question if it is actually necessary for the differeniation. Coz, I am thinking if we want to different no-match-maxtime(andrew's proposed recongition-timeout), and success-maxtime(already defined), then we should also define partial-match-maxtime, which has a higher probablity of occuring than success-maxtime :-)) What do you think, Sarvi=20 -----Original Message----- From: speechsc-bounces@ietf.org=20 [mailto:speechsc-bounces@ietf.org] On Behalf Of=20 Shanmugham, Saravanan Sent: Monday, November 07, 2005 10:54 AM To: Dave Burke; Andrew Wahbe; IETF SPEECHSC (E-mail) Subject: RE: [speechsc] VoiceXML maxspeech timeout not=20 implementable withMRCPv2 =20 I understand and see his point. I think what he is=20 proposing is what it used to be before it was changed=20 based on previous feed back. Hence I want to be carefull=20 before making this change.=20 =20 I have this as the only Open Issue in my slides for the=20 SpeechSC meeting. If there is concerns raised this is a=20 simple enough change to get into -09. =20 Sarvi=20 =20 =20 =20 -----Original Message----- From: speechsc-bounces@ietf.org=20 [mailto:speechsc-bounces@ietf.org] On Behalf Of Dave Burke Sent: Monday, November 07, 2005 9:17 AM To: Andrew Wahbe; IETF SPEECHSC (E-mail) Subject: Re: [speechsc] VoiceXML maxspeech timeout not=20 implementable withMRCPv2 =20 I agree with Andrew's analysis and that his proposed text=20 clarification is warranted, i.e. =20 "003 recognition-timeout: RECOGNIZE completed without a=20 match due to a recognition-timeout" =20 Dave =20 ----- Original Message ----- From: "Andrew Wahbe" To: "IETF SPEECHSC (E-mail)" Sent: Friday, October 28, 2005 3:33 AM Subject: Re: [speechsc] VoiceXML maxspeech timeout not=20 implementable withMRCPv2 =20 =20 > Man I should stop writing emails so late at night: > My new proposed text for 003 wasn't changed at all. This=20 is what I am > proposing: > > 003 recognition-timeout: RECOGNIZE completed without a=20 match due to a=20 > recognition-timeout > > Andrew > > Andrew Wahbe wrote: > >> I just noticed that my completion cause code analysis=20 was wrong in=20 >> one >> case: >> >> - between drafts 06 and 07 it seems the definition=20 of Speech=20 >> Incomplete Timeout (Section 9.4.16) was changed so that=20 >> "partial-match" (013) was returned when it fired=20 instead of "no-match" >> (001). >> >> So I guess what we are saying is that another way of=20 terminating=20 >> recognition is that the speech is just not matching the=20 grammar at=20 >> all and that is what yeilds no-match (001). (Though I=20 could have=20 >> sworn that a recognizer waits until you were done=20 talking even if the=20 >> utterance wasn't matching at all and that incomplete=20 timeout was used=20 >> in that case. Anyone care to clarify?) >> >> Anyways, I don't think this changes my real point below. >> >> Andrew >> >> Andrew Wahbe wrote: >> >>> Actually, this is something I've raised before with=20 the 07 draft=20 >>> (see >>>=20 http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 1510.html >>> ) >>> but is still not fixed in 08. >>> >>> I don't see a way to implement the VoiceXML maxspeech=20 timeout with=20 >>> the current recognition completion cause codes. The=20 semantics of the=20 >>> Recognition Timer match those of VoiceXML maxspeech=20 timeout. There=20 >>> are currently 2 cause codes related to this timer: >>> >>> 003 recognition-timeout: RECOGNIZE in hotword mode=20 completed without=20 >>> a match due to a recognition-timeout >>> >>> 008 success-maxtime: RECOGNIZE request terminated=20 because speech was=20 >>> too long but whatever was spoken till that point was a=20 full match. >>> >>> But what should be thrown in a non-hotword recognition=20 when the=20 >>> timer fires and there is no match? It can't be 001=20 no-match; the=20 >>> VoiceXML interpreter will not be able to distinguish a=20 "nomatch"=20 >>> from "maxspeech" in this case. >>> >>> Let's take a step back. If we really want to describe=20 what happened,=20 >>> then I think we need to communicate to the client=20 1) why the=20 >>> recognition was stopped and 2) if the result a=20 no-match, a partial=20 >>> match, or a complete match when it stopped. >>> >>> 1) The recognition can stop for the following reasons=20 (the related=20 >>> cause codes in parens): >>> >>> no-input timeout (002) >>> complete timeout (000) >>> incomplete timeout (001 013) >>> recognition timeout (003 and 008) >>> speech too early (007) >>> cancelled (011) >>> various errors (004, 005, 006, 009, 010, 012) >>> >>> 2) The result is irrelevant for no-input, speech=20 too early,=20 >>> cancelled, and the errors. This leaves complete=20 timeout, incomplete=20 >>> timeout, and recognition timeout. By the definition of=20 complete=20 >>> timeout, this is always a complete match (000). Also=20 by definition,=20 >>> incomplete timeout could result in a no match (001) or=20 a partial=20 >>> match (013) -- though VoiceXML will likely throw=20 nomatch for either. >>> >>> So we are left with recognition timeout. It could be a=20 complete=20 >>> match >>> (008 by the definition in section 9.4.11) or a partial=20 match (008 by=20 >>> the definition in section 9.4.7), or a no-match... =20 well if it was a=20 >>> hotword recognition we have 003; for the normal case,=20 we have nothing. >>> >>> So hopefully I have clarified my earlier point that=20 Recognition=20 >>> Complete Cause codes 003 and 008 need revisiting.=20 Sorry to be so=20 >>> long winded but I think it's important that VoiceXML be=20 >>> implementable using MRCPv2; it is a very common use=20 case for the specification. >>> >>> Before proposing a solution, I would like to again ask=20 why we need=20 >>> to have a special cause code for recognition-timeout=20 in the hotword=20 >>> case. The answer to this question is here: >>>=20 http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 1501.html >>> >>> But I still don't see why you need a special=20 completion cause to say=20 >>> that the timer expired in hotword mode. The client=20 knows what mode=20 >>> the recognition is in. You just need to tell it that=20 the timer fired. >>> >>> What the client does need to know is if there is a=20 valid result for=20 >>> it to process. Thus I propose that 003 be=20 rewritten as follows: >>> >>> 003 recognition-timeout: RECOGNIZE in hotword mode=20 completed without=20 >>> a match due to a recognition-timeout >>> >>> This will bring it back in line with the earlier=20 Scansoft & Nuance=20 >>> proposal (well now it just a Nuance proposal) from a=20 year and a half >>> ago: >>>=20 http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 0560.html >>> >>> Andrew >>> >>> _______________________________________________ >>> Speechsc mailing list >>> Speechsc@ietf.org >>> https://www1.ietf.org/mailman/listinfo/speechsc >>> >>> > =20 =20 ----------------------------------------------------------- --------------------- =20 =20 > _______________________________________________ > Speechsc mailing list > Speechsc@ietf.org > https://www1.ietf.org/mailman/listinfo/speechsc >=20 =20 =20 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc =20 =20 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc =20 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Thu Nov 10 16:26:19 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaJw2-0000Is-Uk; Thu, 10 Nov 2005 16:26:18 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaJw1-0000EY-VT for speechsc@megatron.ietf.org; Thu, 10 Nov 2005 16:26:18 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA15897 for ; Thu, 10 Nov 2005 16:25:49 -0500 (EST) Received: from mail.voicegenie.com ([205.150.90.87] helo=voicegenie.com) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EaKCJ-0004Nr-Iy for speechsc@ietf.org; Thu, 10 Nov 2005 16:43:08 -0500 Received: from [205.150.90.65] (parrot.voicegenie.com [205.150.90.65]) by voicegenie.com (8.11.6+Sun/8.9.3) with ESMTP id jAALPuF17609; Thu, 10 Nov 2005 16:25:57 -0500 (EST) Message-ID: <4373BAE4.3050005@voicegenie.com> Date: Thu, 10 Nov 2005 16:25:56 -0500 From: Andrew Wahbe Organization: VoiceGenie Technologies User-Agent: Mozilla Thunderbird 1.0.2 (Windows/20050317) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Shanmugham, Saravanan" Subject: Re: [speechsc] VoiceXML maxspeech timeout not implementable withMRCPv2 References: <03772D1EC8DE624A863058C75874A75C73A327@vtg-um-e2k6.sj21ad.cisco.com> In-Reply-To: <03772D1EC8DE624A863058C75874A75C73A327@vtg-um-e2k6.sj21ad.cisco.com> Content-Type: multipart/mixed; boundary="------------050001000600030907030903" X-Spam-Score: 0.0 (/) X-Scan-Signature: 3a331e4a192f4d33f18e6f8376287cf6 Cc: "IETF SPEECHSC $E-mail$" , Dave Burke X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org This is a multi-part message in MIME format. --------------050001000600030907030903 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit I don't think we are all that far off, the devil is in the details. First, VoiceXML requires that your "no-match-maxtime" be distinguished from no-match. So if we want all the VoiceXML browsers out there to use MRCP then we need to make the distinction. Second, we already have no-match-maxtime: its 003 recognition-timeout. Except that it is restricted to hotword recognition in the current draft. Since the client knows the recognition mode, I propose that we re-use the code for both modes. So no-match-maxtime is the same as 003 recognition-timeout. We don't need two codes to tell the client something it already knows. I think the point that I was trying to make at the start of all this is captured above. I additionally pointed out in my last email on the topic that reliably discerning no-match from partial-match from complete-match in the return code (regardless of the timer that fired) may not be feasible nor terribly useful. What is important is if we have a result or not -- this is effectively match vs no-match but there is a subtle difference. The semantics here revolve around the presence of a usable recognition result rather than which timer fired. If we collapse the partial match codes in the the respective no-match codes then we end up with the codes looking how they did in the 06 draft -- except that we have now clarified the difference between 003 and 008. I would agree though that for "completeness", if we don't get rid of partial-match, then it may make sense to have something like partial-match-maxtime. But again, I don't think that its really worth it to have any partial-match codes at all. If we don't remove the partial-match codes and base the match vs no-match distinction on the presence of a result (rather than timers) then I would like clarification as to how confidence-threshold works into all this. If the complete-timeout fires, but the resulting match is below the confidence-threshold, then do you get a "match" result but no recognition results? That seems broken. Section 9.4 (in the definition of confidence-threshold) it actually says that a no-match must be returned in that case. So something else would have to be changed here. Andrew Shanmugham, Saravanan wrote: >I just re-read this whole description and came to a different >conlcusion. > >When a recongizer is started and the speaker starts speaking. > >The match conditions for the spoken speech and grammar matching upto >that point can be one of the following > no-match > partial-match > complete-match >You could go one step further and say a complete-match can be of 2 >sub-classes on further grammar possibilities(Is there value to doing >this. My thinking is not, Any thoughts) > complete-match-with-more-possibilities > Complete-match-with-no-more-possibilities > >Theoretically speaking, no-match could be decided the moment that the >speaker speaks something that does not match anything, which could be >even the first word. >But, if I understand right, most commercial recognizers today don't >match the speech as they are spoken, but rather wait for >end-pointing/some-silience or a timer before doing the matching. > >NOW, walking through what happens during a non-hotword RECOGNIZE >operation. > >The Recognition-Timer is started right at the beginning of the RECOGNIZE >operation. > >When such a silence happens they take the speech collected upto that >point, and match it with the grammar and do the following. > 1. If there was NO match - they generate "no-match 001" completion >cause code. > 2. If there was partial-match - They start the >Speech-Incomplete-Timer > 3. If there was complete-match(both sub types) - They start the >Speech-Complete-Timer. > >Now lets look at the cases of the different timers expiring. > >Speech-Incomplete-Timer expires, we return partial-match 013 >When Speech-Complete-Timer expires, we return "success" 000. >When Recognition-Timer expires we have multiple possibilities based on >the grammar match condition. > 1. no-match 001 > 2. partial-match 013 > 3. succcess (which is differentiated as success-maxtime 008) > >So I don't see what we are missing here. >Now theoretically, we could create new completion cause code >no-match-max-time, and partial-match-maxtime to differentiate them from >the ones that would have been generated during a silence/endpointing. > >Now If I understand Andrew, he is refering to no-match-max-time as >Recognition-Timeout, which is fine by me, but question if it is actually >necessary for the differeniation. Coz, I am thinking if we want to >different no-match-maxtime(andrew's proposed recongition-timeout), and >success-maxtime(already defined), then we should also define >partial-match-maxtime, which has a higher probablity of occuring than >success-maxtime :-)) > >What do you think, >Sarvi > > > -----Original Message----- > From: speechsc-bounces@ietf.org > [mailto:speechsc-bounces@ietf.org] On Behalf Of > Shanmugham, Saravanan > Sent: Monday, November 07, 2005 10:54 AM > To: Dave Burke; Andrew Wahbe; IETF SPEECHSC (E-mail) > Subject: RE: [speechsc] VoiceXML maxspeech timeout not > implementable withMRCPv2 > > I understand and see his point. I think what he is > proposing is what it used to be before it was changed > based on previous feed back. Hence I want to be carefull > before making this change. > > I have this as the only Open Issue in my slides for the > SpeechSC meeting. If there is concerns raised this is a > simple enough change to get into -09. > > Sarvi > > > > -----Original Message----- > From: speechsc-bounces@ietf.org > [mailto:speechsc-bounces@ietf.org] On Behalf Of Dave Burke > Sent: Monday, November 07, 2005 9:17 AM > To: Andrew Wahbe; IETF SPEECHSC (E-mail) > Subject: Re: [speechsc] VoiceXML maxspeech timeout not > implementable withMRCPv2 > > I agree with Andrew's analysis and that his proposed text > clarification is warranted, i.e. > > "003 recognition-timeout: RECOGNIZE completed without a > match due to a recognition-timeout" > > Dave > > ----- Original Message ----- > From: "Andrew Wahbe" > To: "IETF SPEECHSC (E-mail)" > Sent: Friday, October 28, 2005 3:33 AM > Subject: Re: [speechsc] VoiceXML maxspeech timeout not > implementable > withMRCPv2 > > > > Man I should stop writing emails so late at night: > > My new proposed text for 003 wasn't changed at all. This > is what I am > > proposing: > > > > 003 recognition-timeout: RECOGNIZE completed without a > match due to a > > recognition-timeout > > > > Andrew > > > > Andrew Wahbe wrote: > > > >> I just noticed that my completion cause code analysis > was wrong in > >> one > >> case: > >> > >> - between drafts 06 and 07 it seems the definition > of Speech > >> Incomplete Timeout (Section 9.4.16) was changed so that > >> "partial-match" (013) was returned when it fired > instead of "no-match" > >> (001). > >> > >> So I guess what we are saying is that another way of > terminating > >> recognition is that the speech is just not matching the > grammar at > >> all and that is what yeilds no-match (001). (Though I > could have > >> sworn that a recognizer waits until you were done > talking even if the > >> utterance wasn't matching at all and that incomplete > timeout was used > >> in that case. Anyone care to clarify?) > >> > >> Anyways, I don't think this changes my real point below. > >> > >> Andrew > >> > >> Andrew Wahbe wrote: > >> > >>> Actually, this is something I've raised before with > the 07 draft > >>> (see > >>> > http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 > 1510.html > >>> ) > >>> but is still not fixed in 08. > >>> > >>> I don't see a way to implement the VoiceXML maxspeech > timeout with > >>> the current recognition completion cause codes. The > semantics of the > >>> Recognition Timer match those of VoiceXML maxspeech > timeout. There > >>> are currently 2 cause codes related to this timer: > >>> > >>> 003 recognition-timeout: RECOGNIZE in hotword mode > completed without > >>> a match due to a recognition-timeout > >>> > >>> 008 success-maxtime: RECOGNIZE request terminated > because speech was > >>> too long but whatever was spoken till that point was a > full match. > >>> > >>> But what should be thrown in a non-hotword recognition > when the > >>> timer fires and there is no match? It can't be 001 > no-match; the > >>> VoiceXML interpreter will not be able to distinguish a > "nomatch" > >>> from "maxspeech" in this case. > >>> > >>> Let's take a step back. If we really want to describe > what happened, > >>> then I think we need to communicate to the client > 1) why the > >>> recognition was stopped and 2) if the result a > no-match, a partial > >>> match, or a complete match when it stopped. > >>> > >>> 1) The recognition can stop for the following reasons > (the related > >>> cause codes in parens): > >>> > >>> no-input timeout (002) > >>> complete timeout (000) > >>> incomplete timeout (001 013) > >>> recognition timeout (003 and 008) > >>> speech too early (007) > >>> cancelled (011) > >>> various errors (004, 005, 006, 009, 010, 012) > >>> > >>> 2) The result is irrelevant for no-input, speech > too early, > >>> cancelled, and the errors. This leaves complete > timeout, incomplete > >>> timeout, and recognition timeout. By the definition of > complete > >>> timeout, this is always a complete match (000). Also > by definition, > >>> incomplete timeout could result in a no match (001) or > a partial > >>> match (013) -- though VoiceXML will likely throw > nomatch for either. > >>> > >>> So we are left with recognition timeout. It could be a > complete > >>> match > >>> (008 by the definition in section 9.4.11) or a partial > match (008 by > >>> the definition in section 9.4.7), or a no-match... > well if it was a > >>> hotword recognition we have 003; for the normal case, > we have nothing. > >>> > >>> So hopefully I have clarified my earlier point that > Recognition > >>> Complete Cause codes 003 and 008 need revisiting. > Sorry to be so > >>> long winded but I think it's important that VoiceXML be > >>> implementable using MRCPv2; it is a very common use > case for the specification. > >>> > >>> Before proposing a solution, I would like to again ask > why we need > >>> to have a special cause code for recognition-timeout > in the hotword > >>> case. The answer to this question is here: > >>> > http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 > 1501.html > >>> > >>> But I still don't see why you need a special > completion cause to say > >>> that the timer expired in hotword mode. The client > knows what mode > >>> the recognition is in. You just need to tell it that > the timer fired. > >>> > >>> What the client does need to know is if there is a > valid result for > >>> it to process. Thus I propose that 003 be > rewritten as follows: > >>> > >>> 003 recognition-timeout: RECOGNIZE in hotword mode > completed without > >>> a match due to a recognition-timeout > >>> > >>> This will bring it back in line with the earlier > Scansoft & Nuance > >>> proposal (well now it just a Nuance proposal) from a > year and a half > >>> ago: > >>> > http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 > 0560.html > >>> > >>> Andrew > >>> > >>> _______________________________________________ > >>> Speechsc mailing list > >>> Speechsc@ietf.org > >>> https://www1.ietf.org/mailman/listinfo/speechsc > >>> > >>> > > > > > ----------------------------------------------------------- > --------------------- > > > > _______________________________________________ > > Speechsc mailing list > > Speechsc@ietf.org > > https://www1.ietf.org/mailman/listinfo/speechsc > > > > > _______________________________________________ > Speechsc mailing list > Speechsc@ietf.org > https://www1.ietf.org/mailman/listinfo/speechsc > > > _______________________________________________ > Speechsc mailing list > Speechsc@ietf.org > https://www1.ietf.org/mailman/listinfo/speechsc > > > > > --------------050001000600030907030903 Content-Type: text/x-vcard; charset=utf-8; name="awahbe.vcf" Content-Disposition: attachment; filename="awahbe.vcf" Content-Transfer-Encoding: 7bit begin:vcard fn:Andrew Wahbe n:Wahbe;Andrew org:VoiceGenie Technologies INC.;Multimodal and Development Tools adr:8th Floor;;1120 Finch Avenue W.;Toronto;ON;M3J 3H7;Canada email;internet:awahbe@voicegenie.com title:Technical Manager tel;work:(416) 736-0905 ext. 258 tel;fax:(416) 736-1551 x-mozilla-html:TRUE url:http://www.voicegenie.com version:2.1 end:vcard --------------050001000600030907030903 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc --------------050001000600030907030903-- From speechsc-bounces@ietf.org Thu Nov 10 16:39:05 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaK8P-0002uR-Pb; Thu, 10 Nov 2005 16:39:05 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaK8O-0002tl-EF for speechsc@megatron.ietf.org; Thu, 10 Nov 2005 16:39:05 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA17662 for ; Thu, 10 Nov 2005 16:38:35 -0500 (EST) Received: from sj-iport-5.cisco.com ([171.68.10.87]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EaKOd-00056d-TX for speechsc@ietf.org; Thu, 10 Nov 2005 16:55:54 -0500 Received: from sj-core-4.cisco.com ([171.68.223.138]) by sj-iport-5.cisco.com with ESMTP; 10 Nov 2005 13:38:36 -0800 X-IronPort-AV: i="3.99,115,1131350400"; d="scan'208"; a="229109406:sNHT774709650" Received: from vtg-um-e2k6.sj21ad.cisco.com (vtg-um-e2k6.cisco.com [171.70.93.77]) by sj-core-4.cisco.com (8.12.10/8.12.6) with ESMTP id jAALcZag016682; Thu, 10 Nov 2005 13:38:35 -0800 (PST) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Subject: RE: [speechsc] VoiceXML maxspeech timeout not implementable withMRCPv2 - RESEND Date: Thu, 10 Nov 2005 13:38:34 -0800 Message-ID: <03772D1EC8DE624A863058C75874A75C73A340@vtg-um-e2k6.sj21ad.cisco.com> Thread-Topic: [speechsc] VoiceXML maxspeech timeout not implementable withMRCPv2 - RESEND Thread-Index: AcXjv+z4t3Gy6OPdT+SBCfRZty8yUgADA4mAAJoDe9AAAsGLwA== From: "Shanmugham, Saravanan" To: "Shanmugham, Saravanan" , "Dave Burke" , "Andrew Wahbe" , "IETF SPEECHSC $E-mail$" X-Spam-Score: 0.0 (/) X-Scan-Signature: 819069d28e3cfe534e22b502261ce83f Content-Transfer-Encoding: quoted-printable Cc: X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org I just re-read this whole description and came to a different conlcusion. When a recongizer is started and enough speech has been collected to be able to process. The match conditions for the spoken speech and grammar matching upto that point can be one of the following no-match partial-match complete-match You could go one step further and say a complete-match can be of 2 sub-classes on further grammar possibilities(Is there value to doing this. My thinking is not, Any thoughts) complete-match-with-more-possibilities Complete-match-with-no-more-possibilities Theoretically speaking, no-match could be decided the moment that the speaker speaks something that does not match anything, which could be even the first word. But, if I understand right, most commercial recognizers today don't match the speech as they are spoken, but rather wait for end-pointing/some-silience or a timer before doing the matching. NOW, walking through what happens during a non-hotword RECOGNIZE operation. The Recognition-Timer is started right at the beginning of the RECOGNIZE operation. When silence/enpoint happens they take the speech collected upto that point, and match it with the grammar and do the following. 1. If there was NO match - they generate "no-match 001" completion cause code. 2. If there was partial-match - They start the Speech-Incomplete-Timer 3. If there was complete-match(both sub types) - They start the Speech-Complete-Timer. Now lets look at the cases of the different timers expiring. Speech-Incomplete-Timer expires, we return partial-match 013=20 When Speech-Complete-Timer expires, we return "success" 000. When Recognition-Timer expires we have multiple possibilities based on the grammar match condition. 1. no-match 001 2. partial-match 013 3. succcess (which is differentiated as success-maxtime 008) So I don't see what we are missing here. Now theoretically, we could create new completion cause code no-match-max-time, and partial-match-maxtime to differentiate them from the ones that would have been generated during a silence/endpointing. Now If I understand Andrew, he is refering to no-match-max-time as Recognition-Timeout, which is fine by me, but question if it is actually necessary for the differeniation. Coz, I am thinking if we want to different no-match-maxtime(andrew's proposed recongition-timeout), and success-maxtime(already defined), then we should also define partial-match-maxtime, which has a higher probablity of occuring than success-maxtime :-)) What do you think, Sarvi=20 -----Original Message----- From: Shanmugham, Saravanan=20 Sent: Thursday, November 10, 2005 12:51 PM To: Shanmugham, Saravanan; Dave Burke; Andrew Wahbe; IETF=20 SPEECHSC (E-mail) Subject: RE: [speechsc] VoiceXML maxspeech timeout not=20 implementable withMRCPv2 =20 I just re-read this whole description and came to a=20 different conlcusion. =20 When a recongizer is started and enough speech has been=20 collected to be able to process. =20 The match conditions for the spoken speech and grammar=20 matching upto that point can be one of the following no-match partial-match complete-match You could go one step further and say a complete-match can=20 be of 2 sub-classes on further grammar possibilities(Is=20 there value to doing this. My thinking is not, Any thoughts) complete-match-with-more-possibilities Complete-match-with-no-more-possibilities =20 Theoretically speaking, no-match could be decided the=20 moment that the speaker speaks something that does not=20 match anything, which could be even the first word. But, if I understand right, most commercial recognizers=20 today don't match the speech as they are spoken, but=20 rather wait for end-pointing/some-silience or a timer=20 before doing the matching. =20 NOW, walking through what happens during a non-hotword=20 RECOGNIZE operation. =20 The Recognition-Timer is started right at the beginning of=20 the RECOGNIZE operation. =20 When silence/enpoint happens they take the speech=20 collected upto that point, and match it with the grammar=20 and do the following. 1. If there was NO match - they generate "no-match=20 001" completion cause code. 2. If there was partial-match - They start the=20 Speech-Incomplete-Timer 3. If there was complete-match(both sub types) - They=20 start the Speech-Complete-Timer. =20 Now lets look at the cases of the different timers expiring. =20 Speech-Incomplete-Timer expires, we return partial-match 013=20 When Speech-Complete-Timer expires, we return "success" 000. When Recognition-Timer expires we have multiple=20 possibilities based on the grammar match condition. 1. no-match 001 2. partial-match 013 3. succcess (which is differentiated as success-maxtime 008) =20 So I don't see what we are missing here. Now theoretically, we could create new completion cause=20 code no-match-max-time, and partial-match-maxtime to=20 differentiate them from the ones that would have been=20 generated during a silence/endpointing. =20 Now If I understand Andrew, he is refering to=20 no-match-max-time as Recognition-Timeout, which is fine by=20 me, but question if it is actually necessary for the=20 differeniation. Coz, I am thinking if we want to different=20 no-match-maxtime(andrew's proposed recongition-timeout),=20 and success-maxtime(already defined), then we should also=20 define partial-match-maxtime, which has a higher=20 probablity of occuring than success-maxtime :-)) =20 What do you think, Sarvi=20 =20 =20 -----Original Message----- From: speechsc-bounces@ietf.org=20 [mailto:speechsc-bounces@ietf.org] On Behalf Of=20 Shanmugham, Saravanan Sent: Monday, November 07, 2005 10:54 AM To: Dave Burke; Andrew Wahbe; IETF SPEECHSC (E-mail) Subject: RE: [speechsc] VoiceXML maxspeech timeout not=20 implementable withMRCPv2 =20 I understand and see his point. I think what he is=20 proposing is what it used to be before it was changed=20 based on previous feed back. Hence I want to be carefull=20 before making this change.=20 =20 I have this as the only Open Issue in my slides for the=20 SpeechSC meeting. If there is concerns raised this is a=20 simple enough change to get into -09. =20 Sarvi=20 =20 =20 =20 -----Original Message----- From: speechsc-bounces@ietf.org=20 [mailto:speechsc-bounces@ietf.org] On Behalf Of=20 Dave Burke Sent: Monday, November 07, 2005 9:17 AM To: Andrew Wahbe; IETF SPEECHSC (E-mail) Subject: Re: [speechsc] VoiceXML maxspeech timeout not=20 implementable withMRCPv2 =20 I agree with Andrew's analysis and that his=20 proposed text=20 clarification is warranted, i.e. =20 "003 recognition-timeout: RECOGNIZE completed without a=20 match due to a recognition-timeout" =20 Dave =20 ----- Original Message ----- From: "Andrew Wahbe" To: "IETF SPEECHSC (E-mail)" Sent: Friday, October 28, 2005 3:33 AM Subject: Re: [speechsc] VoiceXML maxspeech timeout not=20 implementable withMRCPv2 =20 =20 > Man I should stop writing emails so late at night: > My new proposed text for 003 wasn't changed at=20 all. This=20 is what I am > proposing: > > 003 recognition-timeout: RECOGNIZE completed without a=20 match due to a=20 > recognition-timeout > > Andrew > > Andrew Wahbe wrote: > >> I just noticed that my completion cause code analysis=20 was wrong in=20 >> one >> case: >> >> - between drafts 06 and 07 it seems the definition=20 of Speech=20 >> Incomplete Timeout (Section 9.4.16) was=20 changed so that=20 >> "partial-match" (013) was returned when it fired=20 instead of "no-match" >> (001). >> >> So I guess what we are saying is that another way of=20 terminating=20 >> recognition is that the speech is just not=20 matching the=20 grammar at=20 >> all and that is what yeilds no-match (001). (Though I=20 could have=20 >> sworn that a recognizer waits until you were done=20 talking even if the=20 >> utterance wasn't matching at all and that incomplete=20 timeout was used=20 >> in that case. Anyone care to clarify?) >> >> Anyways, I don't think this changes my real=20 point below. >> >> Andrew >> >> Andrew Wahbe wrote: >> >>> Actually, this is something I've raised before with=20 the 07 draft=20 >>> (see >>>=20 =20 http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 1510.html >>> ) >>> but is still not fixed in 08. >>> >>> I don't see a way to implement the VoiceXML=20 maxspeech=20 timeout with=20 >>> the current recognition completion cause codes. The=20 semantics of the=20 >>> Recognition Timer match those of VoiceXML maxspeech=20 timeout. There=20 >>> are currently 2 cause codes related to this timer: >>> >>> 003 recognition-timeout: RECOGNIZE in hotword mode=20 completed without=20 >>> a match due to a recognition-timeout >>> >>> 008 success-maxtime: RECOGNIZE request terminated=20 because speech was=20 >>> too long but whatever was spoken till that=20 point was a=20 full match. >>> >>> But what should be thrown in a non-hotword=20 recognition=20 when the=20 >>> timer fires and there is no match? It can't be 001=20 no-match; the=20 >>> VoiceXML interpreter will not be able to=20 distinguish a=20 "nomatch"=20 >>> from "maxspeech" in this case. >>> >>> Let's take a step back. If we really want to=20 describe=20 what happened,=20 >>> then I think we need to communicate to the client=20 1) why the=20 >>> recognition was stopped and 2) if the result a=20 no-match, a partial=20 >>> match, or a complete match when it stopped. >>> >>> 1) The recognition can stop for the=20 following reasons=20 (the related=20 >>> cause codes in parens): >>> >>> no-input timeout (002) >>> complete timeout (000) >>> incomplete timeout (001 013) >>> recognition timeout (003 and 008) >>> speech too early (007) >>> cancelled (011) >>> various errors (004, 005, 006, 009, 010, 012) >>> >>> 2) The result is irrelevant for no-input, speech=20 too early,=20 >>> cancelled, and the errors. This leaves complete=20 timeout, incomplete=20 >>> timeout, and recognition timeout. By the=20 definition of=20 complete=20 >>> timeout, this is always a complete match (000). Also=20 by definition,=20 >>> incomplete timeout could result in a no=20 match (001) or=20 a partial=20 >>> match (013) -- though VoiceXML will likely throw=20 nomatch for either. >>> >>> So we are left with recognition timeout. It=20 could be a=20 complete=20 >>> match >>> (008 by the definition in section 9.4.11) or=20 a partial=20 match (008 by=20 >>> the definition in section 9.4.7), or a no-match... =20 well if it was a=20 >>> hotword recognition we have 003; for the=20 normal case,=20 we have nothing. >>> >>> So hopefully I have clarified my earlier point that=20 Recognition=20 >>> Complete Cause codes 003 and 008 need revisiting.=20 Sorry to be so=20 >>> long winded but I think it's important that=20 VoiceXML be=20 >>> implementable using MRCPv2; it is a very common use=20 case for the specification. >>> >>> Before proposing a solution, I would like to=20 again ask=20 why we need=20 >>> to have a special cause code for recognition-timeout=20 in the hotword=20 >>> case. The answer to this question is here: >>>=20 =20 http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 1501.html >>> >>> But I still don't see why you need a special=20 completion cause to say=20 >>> that the timer expired in hotword mode. The client=20 knows what mode=20 >>> the recognition is in. You just need to tell it that=20 the timer fired. >>> >>> What the client does need to know is if there is a=20 valid result for=20 >>> it to process. Thus I propose that 003 be=20 rewritten as follows: >>> >>> 003 recognition-timeout: RECOGNIZE in hotword mode=20 completed without=20 >>> a match due to a recognition-timeout >>> >>> This will bring it back in line with the earlier=20 Scansoft & Nuance=20 >>> proposal (well now it just a Nuance proposal) from a=20 year and a half >>> ago: >>>=20 =20 http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 0560.html >>> >>> Andrew >>> >>> _______________________________________________ >>> Speechsc mailing list >>> Speechsc@ietf.org >>> https://www1.ietf.org/mailman/listinfo/speechsc >>> >>> > =20 =20 =20 ----------------------------------------------------------- --------------------- =20 =20 > _______________________________________________ > Speechsc mailing list > Speechsc@ietf.org > https://www1.ietf.org/mailman/listinfo/speechsc >=20 =20 =20 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc =20 =20 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc =20 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Thu Nov 10 17:50:44 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaLFk-0001Es-0s; Thu, 10 Nov 2005 17:50:44 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaLFi-0001Eb-Bt for speechsc@megatron.ietf.org; Thu, 10 Nov 2005 17:50:42 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA22403 for ; Thu, 10 Nov 2005 17:50:13 -0500 (EST) Received: from sj-iport-5.cisco.com ([171.68.10.87]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EaLW0-0007Ir-Jd for speechsc@ietf.org; Thu, 10 Nov 2005 18:07:33 -0500 Received: from sj-core-3.cisco.com ([171.68.223.137]) by sj-iport-5.cisco.com with ESMTP; 10 Nov 2005 14:50:32 -0800 X-IronPort-AV: i="3.99,115,1131350400"; d="scan'208"; a="229135600:sNHT32508662" Received: from vtg-um-e2k6.sj21ad.cisco.com (vtg-um-e2k6.cisco.com [171.70.93.77]) by sj-core-3.cisco.com (8.12.10/8.12.6) with ESMTP id jAAMoRZd003031; Thu, 10 Nov 2005 14:50:27 -0800 (PST) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Subject: RE: [speechsc] VoiceXML maxspeech timeout not implementable withMRCPv2 Date: Thu, 10 Nov 2005 14:50:29 -0800 Message-ID: <03772D1EC8DE624A863058C75874A75C73A36A@vtg-um-e2k6.sj21ad.cisco.com> Thread-Topic: [speechsc] VoiceXML maxspeech timeout not implementable withMRCPv2 Thread-Index: AcXmPXdbPvT3+15qT9ejDtRa5Gc3XQAAqr4Q From: "Shanmugham, Saravanan" To: "Andrew Wahbe" X-Spam-Score: 0.0 (/) X-Scan-Signature: 8e9fbe727bc2159b431d624c595c1eab Content-Transfer-Encoding: quoted-printable Cc: "IETF SPEECHSC $E-mail$" , Dave Burke X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org Ok.=20 Here is proposal for change, that should clarify this. I will fold the description I mentioned earlier into the RECOGNIZE definition section of the draft. I will then add the following 2 cause-codes partial-match-maxtime (014) no-match-maxtime (015) Recognition-timeout - renamed to "hotword-maxtime" for the expiry of Recognition-Timer in the Hotword mode recognition. That also gets us away from finishing the list of cause codes list with the unlucky 13 :-)) We then add the following clarifications for non-hotword recognition mode. 1. The recongizer MUST support detecting no-match condition upon detecting end of speech. The recognizer MAY support detecting no-match condition before waiting for end-ofspeech. If this is supported, this capability is enabled by setting "Early-NoMatch" header to "true". - upon detecting a no match condition the RECOGNIZE MUST return with "no-match" 2. When the Speech-Incomplete-Timer expires - SHOULD return "partial-match" unless the recognizer cannot differentiate a partial-match in which case it MUST return "no-match". The recognizer MAY return results for the partially matched grammar. 3. When the Speech-Complete-Timer expires - MUST return "success" 4. When the Recognition-Timer expires 4.1 And there is a partial-match -SHOULD return "partial-match-maxtime" unless the recognizer cannot differentiate a partial-match in which case it MUST return "no-match-maxtime". The recognizer MAY return results for the partially matched grammar. 4.2 And there is full-match - MUST return "success-maxtime" 4.2 And there is no match - MUST return "no-match-maxtime" For the Hotword mode recognition. 1. The Recognition-Timer gets started at the beginning of RECOGNIZE. 2. When there is match at anytime, the RECOGNIZE completes with "success". 3. When the Recognition-Timer expires and there is not match, the RECOGNIZE MUST complete with "hotword-maxtime" 4. When the Recognition-Timer expires and there is a match, the RECOGNIZE MUST return "success". Thx, Sarvi, Dan Burnett -----Original Message----- From: Andrew Wahbe [mailto:awahbe@voicegenie.com]=20 Sent: Thursday, November 10, 2005 1:26 PM To: Shanmugham, Saravanan Cc: Dave Burke; IETF SPEECHSC (E-mail) Subject: Re: [speechsc] VoiceXML maxspeech timeout not=20 implementable withMRCPv2 =20 I don't think we are all that far off, the devil is in the details. =20 First, VoiceXML requires that your "no-match-maxtime" be=20 distinguished from no-match. So if we want all the=20 VoiceXML browsers out there to use MRCP then we need to=20 make the distinction. =20 Second, we already have no-match-maxtime: its 003=20 recognition-timeout.=20 Except that it is restricted to hotword recognition in the=20 current draft. Since the client knows the recognition=20 mode, I propose that we re-use the code for both modes. So=20 no-match-maxtime is the same as 003 recognition-timeout.=20 We don't need two codes to tell the client something it=20 already knows. =20 I think the point that I was trying to make at the start=20 of all this is captured above. =20 I additionally pointed out in my last email on the topic=20 that reliably discerning no-match from partial-match from=20 complete-match in the return code (regardless of the timer=20 that fired) may not be feasible nor terribly useful. What=20 is important is if we have a result or not -- this is=20 effectively match vs no-match but there is a subtle=20 difference. The semantics here revolve around the presence=20 of a usable recognition result rather than which timer=20 fired. If we collapse the partial match codes in the the=20 respective no-match codes then we end up with the codes=20 looking how they did in the 06 draft -- except that we=20 have now clarified the difference between 003 and 008. =20 I would agree though that for "completeness", if we don't=20 get rid of partial-match, then it may make sense to have=20 something like partial-match-maxtime. But again, I don't=20 think that its really worth it to have any partial-match=20 codes at all. =20 If we don't remove the partial-match codes and base the=20 match vs no-match distinction on the presence of a result=20 (rather than timers) then I would like clarification as to=20 how confidence-threshold works into all this. If the=20 complete-timeout fires, but the resulting match is below=20 the confidence-threshold, then do you get a "match" result=20 but no recognition results? That seems broken. Section 9.4=20 (in the definition of confidence-threshold) it actually=20 says that a no-match must be returned in that case. So=20 something else would have to be changed here. =20 Andrew =20 =20 Shanmugham, Saravanan wrote: =20 >I just re-read this whole description and came to a different >conlcusion. > >When a recongizer is started and the speaker starts speaking. > >The match conditions for the spoken speech and grammar=20 matching upto >that point can be one of the following > no-match > partial-match > complete-match >You could go one step further and say a complete-match can be of 2 >sub-classes on further grammar possibilities(Is there=20 value to doing >this. My thinking is not, Any thoughts) > complete-match-with-more-possibilities > Complete-match-with-no-more-possibilities > >Theoretically speaking, no-match could be decided the=20 moment that the >speaker speaks something that does not match anything,=20 which could be >even the first word. >But, if I understand right, most commercial recognizers=20 today don't >match the speech as they are spoken, but rather wait for >end-pointing/some-silience or a timer before doing the matching. > >NOW, walking through what happens during a non-hotword RECOGNIZE >operation. > >The Recognition-Timer is started right at the beginning=20 of the RECOGNIZE >operation. > >When such a silence happens they take the speech=20 collected upto that >point, and match it with the grammar and do the following. > 1. If there was NO match - they generate "no-match=20 001" completion >cause code. > 2. If there was partial-match - They start the >Speech-Incomplete-Timer > 3. If there was complete-match(both sub types) - They start the >Speech-Complete-Timer. > >Now lets look at the cases of the different timers expiring. > >Speech-Incomplete-Timer expires, we return partial-match 013 >When Speech-Complete-Timer expires, we return "success" 000. >When Recognition-Timer expires we have multiple=20 possibilities based on >the grammar match condition. > 1. no-match 001 > 2. partial-match 013 > 3. succcess (which is differentiated as success-maxtime 008) > >So I don't see what we are missing here. >Now theoretically, we could create new completion cause code >no-match-max-time, and partial-match-maxtime to=20 differentiate them from >the ones that would have been generated during a=20 silence/endpointing. > >Now If I understand Andrew, he is refering to no-match-max-time as >Recognition-Timeout, which is fine by me, but question if=20 it is actually >necessary for the differeniation. Coz, I am thinking if we want to >different no-match-maxtime(andrew's proposed=20 recongition-timeout), and >success-maxtime(already defined), then we should also define >partial-match-maxtime, which has a higher probablity of=20 occuring than >success-maxtime :-)) > >What do you think, >Sarvi=20 > > > -----Original Message----- > From: speechsc-bounces@ietf.org=20 > [mailto:speechsc-bounces@ietf.org] On Behalf Of=20 > Shanmugham, Saravanan > Sent: Monday, November 07, 2005 10:54 AM > To: Dave Burke; Andrew Wahbe; IETF SPEECHSC (E-mail) > Subject: RE: [speechsc] VoiceXML maxspeech timeout not=20 > implementable withMRCPv2 > =20 > I understand and see his point. I think what he is=20 > proposing is what it used to be before it was changed=20 > based on previous feed back. Hence I want to be carefull=20 > before making this change.=20 > =20 > I have this as the only Open Issue in my slides for the=20 > SpeechSC meeting. If there is concerns raised this is a=20 > simple enough change to get into -09. > =20 > Sarvi=20 > =20 > =20 > =20 > -----Original Message----- > From: speechsc-bounces@ietf.org=20 > [mailto:speechsc-bounces@ietf.org] On Behalf Of=20 Dave Burke > Sent: Monday, November 07, 2005 9:17 AM > To: Andrew Wahbe; IETF SPEECHSC (E-mail) > Subject: Re: [speechsc] VoiceXML maxspeech timeout not=20 > implementable withMRCPv2 > =20 > I agree with Andrew's analysis and that his=20 proposed text=20 > clarification is warranted, i.e. > =20 > "003 recognition-timeout: RECOGNIZE completed without a=20 > match due to a recognition-timeout" > =20 > Dave > =20 > ----- Original Message ----- > From: "Andrew Wahbe" > To: "IETF SPEECHSC (E-mail)" > Sent: Friday, October 28, 2005 3:33 AM > Subject: Re: [speechsc] VoiceXML maxspeech timeout not=20 > implementable > withMRCPv2 > =20 > =20 > > Man I should stop writing emails so late at night: > > My new proposed text for 003 wasn't changed=20 at all. This=20 > is what I am > > proposing: > > > > 003 recognition-timeout: RECOGNIZE completed=20 without a=20 > match due to a=20 > > recognition-timeout > > > > Andrew > > > > Andrew Wahbe wrote: > > > >> I just noticed that my completion cause code=20 analysis=20 > was wrong in=20 > >> one > >> case: > >> > >> - between drafts 06 and 07 it seems the definition=20 > of Speech=20 > >> Incomplete Timeout (Section 9.4.16) was=20 changed so that=20 > >> "partial-match" (013) was returned when it fired=20 > instead of "no-match" > >> (001). > >> > >> So I guess what we are saying is that another way of=20 > terminating=20 > >> recognition is that the speech is just not=20 matching the=20 > grammar at=20 > >> all and that is what yeilds no-match (001).=20 (Though I=20 > could have=20 > >> sworn that a recognizer waits until you were done=20 > talking even if the=20 > >> utterance wasn't matching at all and that incomplete=20 > timeout was used=20 > >> in that case. Anyone care to clarify?) > >> > >> Anyways, I don't think this changes my real=20 point below. > >> > >> Andrew > >> > >> Andrew Wahbe wrote: > >> > >>> Actually, this is something I've raised before with=20 > the 07 draft=20 > >>> (see > >>>=20 > =20 http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 > 1510.html > >>> ) > >>> but is still not fixed in 08. > >>> > >>> I don't see a way to implement the VoiceXML=20 maxspeech=20 > timeout with=20 > >>> the current recognition completion cause codes. The=20 > semantics of the=20 > >>> Recognition Timer match those of VoiceXML maxspeech=20 > timeout. There=20 > >>> are currently 2 cause codes related to this timer: > >>> > >>> 003 recognition-timeout: RECOGNIZE in hotword mode=20 > completed without=20 > >>> a match due to a recognition-timeout > >>> > >>> 008 success-maxtime: RECOGNIZE request terminated=20 > because speech was=20 > >>> too long but whatever was spoken till that=20 point was a=20 > full match. > >>> > >>> But what should be thrown in a non-hotword=20 recognition=20 > when the=20 > >>> timer fires and there is no match? It can't be 001=20 > no-match; the=20 > >>> VoiceXML interpreter will not be able to=20 distinguish a=20 > "nomatch"=20 > >>> from "maxspeech" in this case. > >>> > >>> Let's take a step back. If we really want=20 to describe=20 > what happened,=20 > >>> then I think we need to communicate to the client=20 > 1) why the=20 > >>> recognition was stopped and 2) if the result a=20 > no-match, a partial=20 > >>> match, or a complete match when it stopped. > >>> > >>> 1) The recognition can stop for the=20 following reasons=20 > (the related=20 > >>> cause codes in parens): > >>> > >>> no-input timeout (002) > >>> complete timeout (000) > >>> incomplete timeout (001 013) > >>> recognition timeout (003 and 008) > >>> speech too early (007) > >>> cancelled (011) > >>> various errors (004, 005, 006, 009, 010, 012) > >>> > >>> 2) The result is irrelevant for no-input, speech=20 > too early,=20 > >>> cancelled, and the errors. This leaves complete=20 > timeout, incomplete=20 > >>> timeout, and recognition timeout. By the=20 definition of=20 > complete=20 > >>> timeout, this is always a complete match=20 (000). Also=20 > by definition,=20 > >>> incomplete timeout could result in a no=20 match (001) or=20 > a partial=20 > >>> match (013) -- though VoiceXML will likely throw=20 > nomatch for either. > >>> > >>> So we are left with recognition timeout. It=20 could be a=20 > complete=20 > >>> match > >>> (008 by the definition in section 9.4.11)=20 or a partial=20 > match (008 by=20 > >>> the definition in section 9.4.7), or a no-match... =20 > well if it was a=20 > >>> hotword recognition we have 003; for the=20 normal case,=20 > we have nothing. > >>> > >>> So hopefully I have clarified my earlier point that=20 > Recognition=20 > >>> Complete Cause codes 003 and 008 need revisiting.=20 > Sorry to be so=20 > >>> long winded but I think it's important that=20 VoiceXML be=20 > >>> implementable using MRCPv2; it is a very common use=20 > case for the specification. > >>> > >>> Before proposing a solution, I would like=20 to again ask=20 > why we need=20 > >>> to have a special cause code for=20 recognition-timeout=20 > in the hotword=20 > >>> case. The answer to this question is here: > >>>=20 > =20 http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 > 1501.html > >>> > >>> But I still don't see why you need a special=20 > completion cause to say=20 > >>> that the timer expired in hotword mode. The client=20 > knows what mode=20 > >>> the recognition is in. You just need to=20 tell it that=20 > the timer fired. > >>> > >>> What the client does need to know is if there is a=20 > valid result for=20 > >>> it to process. Thus I propose that 003 be=20 > rewritten as follows: > >>> > >>> 003 recognition-timeout: RECOGNIZE in hotword mode=20 > completed without=20 > >>> a match due to a recognition-timeout > >>> > >>> This will bring it back in line with the earlier=20 > Scansoft & Nuance=20 > >>> proposal (well now it just a Nuance=20 proposal) from a=20 > year and a half > >>> ago: > >>>=20 > =20 http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 > 0560.html > >>> > >>> Andrew > >>> > >>> _______________________________________________ > >>> Speechsc mailing list > >>> Speechsc@ietf.org > >>> https://www1.ietf.org/mailman/listinfo/speechsc > >>> > >>> > > > =20 > =20 > =20 ----------------------------------------------------------- > --------------------- > =20 > =20 > > _______________________________________________ > > Speechsc mailing list > > Speechsc@ietf.org > > https://www1.ietf.org/mailman/listinfo/speechsc > >=20 > =20 > =20 > _______________________________________________ > Speechsc mailing list > Speechsc@ietf.org > https://www1.ietf.org/mailman/listinfo/speechsc > =20 > =20 > _______________________________________________ > Speechsc mailing list > Speechsc@ietf.org > https://www1.ietf.org/mailman/listinfo/speechsc > =20 > > > =20 > =20 _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Thu Nov 10 23:27:37 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaQVl-0004GQ-QY; Thu, 10 Nov 2005 23:27:37 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaQVk-0004GL-TZ for speechsc@megatron.ietf.org; Thu, 10 Nov 2005 23:27:36 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id XAA11412 for ; Thu, 10 Nov 2005 23:27:07 -0500 (EST) Received: from mxgate1.brooktrout.com ([204.176.74.10]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EaQm4-0007eR-NT for speechsc@ietf.org; Thu, 10 Nov 2005 23:44:30 -0500 X-IronPort-AV: i="3.99,115,1131339600"; d="scan'208"; a="22559549:sNHT27566304" Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Date: Thu, 10 Nov 2005 23:27:13 -0500 Message-ID: <330A23D8336C0346B5C1A5BB19666647018F8128@ATLANTIS.Brooktrout.com> Thread-Topic: DRAFT Minutes Thread-Index: AcXmeDCYebNXQWN6Ss2D7m7UaLhD7w== From: "Burger, Eric" To: X-Spam-Score: 0.0 (/) X-Scan-Signature: 7aefe408d50e9c7c47615841cb314bed Content-Transfer-Encoding: quoted-printable Subject: [Speechsc] DRAFT Minutes X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org The draft minutes of the meeting are at: http://onsite.ietf.org/proceedings/05nov/minutes/speechsc.txt That may flip to: http://tools.ietf.org/proceedings/05nov/minutes/speechsc.txt PLEASE review the minutes and submit comments by 18 November. _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Fri Nov 11 09:33:22 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaZxy-0000rT-6Q; Fri, 11 Nov 2005 09:33:22 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EaZxv-0000ps-JL; Fri, 11 Nov 2005 09:33:19 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id JAA09268; Fri, 11 Nov 2005 09:32:48 -0500 (EST) Received: from e31.co.us.ibm.com ([32.97.110.149]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EaaCe-0005Ip-BA; Fri, 11 Nov 2005 09:48:33 -0500 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e31.co.us.ibm.com (8.12.11/8.12.11) with ESMTP id jABEVItC001994; Fri, 11 Nov 2005 09:31:18 -0500 Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167]) by d03relay04.boulder.ibm.com (8.12.10/NCO/VERS6.8) with ESMTP id jABEWUVX067040; Fri, 11 Nov 2005 07:32:30 -0700 Received: from d03av01.boulder.ibm.com (loopback [127.0.0.1]) by d03av01.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id jABEVHVv026951; Fri, 11 Nov 2005 07:31:17 -0700 Received: from d03nm119.boulder.ibm.com (d03nm119.boulder.ibm.com [9.17.195.145]) by d03av01.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id jABEVHF3026938; Fri, 11 Nov 2005 07:31:17 -0700 In-Reply-To: <330A23D8336C0346B5C1A5BB19666647018F8128@ATLANTIS.Brooktrout.com> To: "Burger, Eric" MIME-Version: 1.0 Subject: Re: [Speechsc] DRAFT Minutes X-Mailer: Lotus Notes Release 6.0.2CF1 June 9, 2003 Message-ID: From: Brett Gavagni Date: Fri, 11 Nov 2005 09:32:56 -0500 X-MIMETrack: Serialize by Router on D03NM119/03/M/IBM(Release 6.53HF654 | July 22, 2005) at 11/11/2005 07:32:57, Serialize complete at 11/11/2005 07:32:57 Content-Type: text/plain; charset="US-ASCII" X-Spam-Score: 0.0 (/) X-Scan-Signature: 0ddefe323dd869ab027dbfff7eff0465 Cc: speechsc@ietf.org, speechsc-bounces@ietf.org X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org Did you mean to note DMSP in lieu of DMCP as listed in the notes: Distributed Multimodal Synchronization Protocol http://www.ietf.org/internet-drafts/draft-engelsma-dmsp-00.txt Eric asked about DMCP submission David asked if anyone would champion DMCP, in the absence of input will not do anything. David/Eric are willing to help Thanks, Brett Gavagni WebSphere Voice Server Development http://www-306.ibm.com/software/pervasive/voice_server/ gavagni@us.ibm.com "Burger, Eric" Sent by: speechsc-bounces@ietf.org 11/10/2005 11:27 PM To cc Subject [Speechsc] DRAFT Minutes The draft minutes of the meeting are at: http://onsite.ietf.org/proceedings/05nov/minutes/speechsc.txt That may flip to: http://tools.ietf.org/proceedings/05nov/minutes/speechsc.txt PLEASE review the minutes and submit comments by 18 November. _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Mon Nov 14 06:26:57 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EbcUD-0001Ld-3e; Mon, 14 Nov 2005 06:26:57 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EbcUA-0001LR-JF for speechsc@megatron.ietf.org; Mon, 14 Nov 2005 06:26:55 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id GAA13173 for ; Mon, 14 Nov 2005 06:26:23 -0500 (EST) Received: from [195.222.227.20] (helo=gromit.ibp.de ident=Debian-exim) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EbclA-00006L-Cv for speechsc@ietf.org; Mon, 14 Nov 2005 06:44:30 -0500 Received: from [195.222.227.22] (helo=[195.222.227.22]) by gromit.ibp.de with esmtp (Exim 4.50) id 1EbcCH-0006qB-CP; Mon, 14 Nov 2005 12:08:28 +0100 Message-ID: <43787452.1060503@ibp.de> Date: Mon, 14 Nov 2005 12:26:10 +0100 From: Claudia Daboul User-Agent: Mozilla Thunderbird 0.8 (Windows/20040913) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Shanmugham, Saravanan" , Andrew Wahbe Subject: Re: [speechsc] VoiceXML maxspeech timeout not implementable withMRCPv2 References: <03772D1EC8DE624A863058C75874A75C73A36A@vtg-um-e2k6.sj21ad.cisco.com> In-Reply-To: <03772D1EC8DE624A863058C75874A75C73A36A@vtg-um-e2k6.sj21ad.cisco.com> X-Enigmail-Version: 0.86.1.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: 0.0 (/) X-Spam-Score: 0.0 (/) X-Scan-Signature: 9724479da43a8325ad975c1a9b841870 Content-Transfer-Encoding: 7bit Cc: "IETF SPEECHSC $E-mail$" X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org I agree with Andrew that partial results are problematic in particular with respect to semantic interpretation. I still think it's good to leave the option to the recognizer to return recognition results even if there's no complete match. As I understand the current specification, it is not ruled out to return results even in the nomatch (001) case, these might be partial results, complete results with a low confidence or other forms of incomplete results. For example in the nomatch case from my example: 4) "Two coffee, a squishy and a beer." It could either return a partial result: A) Two coffee Or it could return a complete result, but with a low confidence like: B) "Two coffee, a pepsy and a beer." (conf = 30) and the application could use the confidences of the individual items coffee (conf=80), pepsy (conf=10), beer (conf=86) Maybe it could even return something like C) "Two coffee, a <$unknown:> and a beer." I don't think it's necessary to introduce further return codes to cover the different kinds of results. It should rather become part of the standardisation of recognition results to cover also various forms of incomplete results. I don't agree with Andrews interpretation in the case of the recognition timeout (5-5''). I think in this case it's purely accidental to get a complete match. Usually (in the generic case) the utterance would be cut off in the middle of a word. In an application I would only distinguish between the cases where the utterance, up to the cutoff, matched the grammar (partially) or where it didn't match. (I would only make this distinction, if the recognizer indeed would return partial results which the application could use.) Also, as I said before, I would usually treat the no-match-maxtime case the same as the usual no-match, but as Andrew pointed out the distinction is made by VoiceXML. Therefore my suggestion is to define the return codes as follows: When "incomplete-timout" fired we may get: -no-match (001) -partial-match (013) -success (000)(could come already after "complete-timeout" expired) When "recognition-timeout" fired we may get: -no-match-maxtime (003) -partial-match-maxtime (008) - no extra code for "success-maxtime", because success in this case would be accidental and should not be trusted. The case would be included in (008). If you don't want to distinguish the "partial results"-case, just map 013 to 001 and 008 to 003. Then you have exactly the completion events defined in Voice-XML. I am content also with Sarvi's latest suggestion, since it is clearly specified when which case applies and I can still decide to treat (014) and (008) the same. I also think the "Early-NoMatch" header is a good idea. As I noted in my last mail some recognizers have different cases of the "complete-timeout" corresponding to the cases: complete-match-with-more-possibilities Complete-match-with-no-more-possibilities They are ordered such that complete-timeout-wnmp < complete-timeout-wmp < incomplete-timeout. I propose to introduce this distinction. I think it's useful for tuning an applications reaction times. There would be no extra return codes necessary. Both those timeouts should only apply, if the match was reached with a high enough confidence. If it would result in a confidence below confidence-threshold, the recognizer should wait in any case until the "incomplete-timeout" expires. Claudia Shanmugham, Saravanan wrote: > Ok. > Here is proposal for change, that should clarify this. > > I will fold the description I mentioned earlier into the RECOGNIZE > definition section of the draft. > > I will then add the following 2 cause-codes > partial-match-maxtime (014) > no-match-maxtime (015) > Recognition-timeout - renamed to "hotword-maxtime" for the expiry of > Recognition-Timer in the Hotword mode recognition. > > That also gets us away from finishing the list of cause codes list with > the unlucky 13 :-)) > > We then add the following clarifications for non-hotword recognition > mode. > 1. The recongizer MUST support detecting no-match condition upon > detecting end of speech. The recognizer MAY support detecting no-match > condition before waiting for end-ofspeech. If this is supported, this > capability is enabled by setting "Early-NoMatch" header to "true". - > upon detecting a no match condition the RECOGNIZE MUST return with > "no-match" > 2. When the Speech-Incomplete-Timer expires - SHOULD return > "partial-match" unless the recognizer cannot differentiate a > partial-match in which case it MUST return "no-match". The recognizer > MAY return results for the partially matched grammar. > 3. When the Speech-Complete-Timer expires - MUST return "success" > 4. When the Recognition-Timer expires > 4.1 And there is a partial-match -SHOULD return > "partial-match-maxtime" unless the recognizer cannot differentiate a > partial-match in which case it MUST return "no-match-maxtime". The > recognizer MAY return results for the partially matched grammar. > 4.2 And there is full-match - MUST return "success-maxtime" > 4.2 And there is no match - MUST return "no-match-maxtime" > > For the Hotword mode recognition. > 1. The Recognition-Timer gets started at the beginning of > RECOGNIZE. > 2. When there is match at anytime, the RECOGNIZE completes with > "success". > 3. When the Recognition-Timer expires and there is not match, the > RECOGNIZE MUST complete with "hotword-maxtime" > 4. When the Recognition-Timer expires and there is a match, the > RECOGNIZE MUST return "success". > > Thx, > Sarvi, Dan Burnett > > -----Original Message----- > From: Andrew Wahbe [mailto:awahbe@voicegenie.com] > Sent: Thursday, November 10, 2005 1:26 PM > To: Shanmugham, Saravanan > Cc: Dave Burke; IETF SPEECHSC (E-mail) > Subject: Re: [speechsc] VoiceXML maxspeech timeout not > implementable withMRCPv2 > > I don't think we are all that far off, the devil is in the details. > > First, VoiceXML requires that your "no-match-maxtime" be > distinguished from no-match. So if we want all the > VoiceXML browsers out there to use MRCP then we need to > make the distinction. > > Second, we already have no-match-maxtime: its 003 > recognition-timeout. > Except that it is restricted to hotword recognition in the > current draft. Since the client knows the recognition > mode, I propose that we re-use the code for both modes. So > no-match-maxtime is the same as 003 recognition-timeout. > We don't need two codes to tell the client something it > already knows. > > I think the point that I was trying to make at the start > of all this is captured above. > > I additionally pointed out in my last email on the topic > that reliably discerning no-match from partial-match from > complete-match in the return code (regardless of the timer > that fired) may not be feasible nor terribly useful. What > is important is if we have a result or not -- this is > effectively match vs no-match but there is a subtle > difference. The semantics here revolve around the presence > of a usable recognition result rather than which timer > fired. If we collapse the partial match codes in the the > respective no-match codes then we end up with the codes > looking how they did in the 06 draft -- except that we > have now clarified the difference between 003 and 008. > > I would agree though that for "completeness", if we don't > get rid of partial-match, then it may make sense to have > something like partial-match-maxtime. But again, I don't > think that its really worth it to have any partial-match > codes at all. > > If we don't remove the partial-match codes and base the > match vs no-match distinction on the presence of a result > (rather than timers) then I would like clarification as to > how confidence-threshold works into all this. If the > complete-timeout fires, but the resulting match is below > the confidence-threshold, then do you get a "match" result > but no recognition results? That seems broken. Section 9.4 > (in the definition of confidence-threshold) it actually > says that a no-match must be returned in that case. So > something else would have to be changed here. > > Andrew > > > Shanmugham, Saravanan wrote: > > >I just re-read this whole description and came to a different > >conlcusion. > > > >When a recongizer is started and the speaker starts speaking. > > > >The match conditions for the spoken speech and grammar > matching upto > >that point can be one of the following > > no-match > > partial-match > > complete-match > >You could go one step further and say a complete-match can be of 2 > >sub-classes on further grammar possibilities(Is there > value to doing > >this. My thinking is not, Any thoughts) > > complete-match-with-more-possibilities > > Complete-match-with-no-more-possibilities > > > >Theoretically speaking, no-match could be decided the > moment that the > >speaker speaks something that does not match anything, > which could be > >even the first word. > >But, if I understand right, most commercial recognizers > today don't > >match the speech as they are spoken, but rather wait for > >end-pointing/some-silience or a timer before doing the matching. > > > >NOW, walking through what happens during a non-hotword RECOGNIZE > >operation. > > > >The Recognition-Timer is started right at the beginning > of the RECOGNIZE > >operation. > > > >When such a silence happens they take the speech > collected upto that > >point, and match it with the grammar and do the following. > > 1. If there was NO match - they generate "no-match > 001" completion > >cause code. > > 2. If there was partial-match - They start the > >Speech-Incomplete-Timer > > 3. If there was complete-match(both sub types) - They start the > >Speech-Complete-Timer. > > > >Now lets look at the cases of the different timers expiring. > > > >Speech-Incomplete-Timer expires, we return partial-match 013 > >When Speech-Complete-Timer expires, we return "success" 000. > >When Recognition-Timer expires we have multiple > possibilities based on > >the grammar match condition. > > 1. no-match 001 > > 2. partial-match 013 > > 3. succcess (which is differentiated as success-maxtime 008) > > > >So I don't see what we are missing here. > >Now theoretically, we could create new completion cause code > >no-match-max-time, and partial-match-maxtime to > differentiate them from > >the ones that would have been generated during a > silence/endpointing. > > > >Now If I understand Andrew, he is refering to no-match-max-time as > >Recognition-Timeout, which is fine by me, but question if > it is actually > >necessary for the differeniation. Coz, I am thinking if we want to > >different no-match-maxtime(andrew's proposed > recongition-timeout), and > >success-maxtime(already defined), then we should also define > >partial-match-maxtime, which has a higher probablity of > occuring than > >success-maxtime :-)) > > > >What do you think, > >Sarvi > > > > > > -----Original Message----- > > From: speechsc-bounces@ietf.org > > [mailto:speechsc-bounces@ietf.org] On Behalf Of > > Shanmugham, Saravanan > > Sent: Monday, November 07, 2005 10:54 AM > > To: Dave Burke; Andrew Wahbe; IETF SPEECHSC (E-mail) > > Subject: RE: [speechsc] VoiceXML maxspeech timeout not > > implementable withMRCPv2 > > > > I understand and see his point. I think what he is > > proposing is what it used to be before it was changed > > based on previous feed back. Hence I want to be carefull > > before making this change. > > > > I have this as the only Open Issue in my slides for the > > SpeechSC meeting. If there is concerns raised this is a > > simple enough change to get into -09. > > > > Sarvi > > > > > > > > -----Original Message----- > > From: speechsc-bounces@ietf.org > > [mailto:speechsc-bounces@ietf.org] On Behalf Of > Dave Burke > > Sent: Monday, November 07, 2005 9:17 AM > > To: Andrew Wahbe; IETF SPEECHSC (E-mail) > > Subject: Re: [speechsc] VoiceXML maxspeech timeout not > > implementable withMRCPv2 > > > > I agree with Andrew's analysis and that his > proposed text > > clarification is warranted, i.e. > > > > "003 recognition-timeout: RECOGNIZE completed without a > > match due to a recognition-timeout" > > > > Dave > > > > ----- Original Message ----- > > From: "Andrew Wahbe" > > To: "IETF SPEECHSC (E-mail)" > > Sent: Friday, October 28, 2005 3:33 AM > > Subject: Re: [speechsc] VoiceXML maxspeech timeout not > > implementable > > withMRCPv2 > > > > > > > Man I should stop writing emails so late at night: > > > My new proposed text for 003 wasn't changed > at all. This > > is what I am > > > proposing: > > > > > > 003 recognition-timeout: RECOGNIZE completed > without a > > match due to a > > > recognition-timeout > > > > > > Andrew > > > > > > Andrew Wahbe wrote: > > > > > >> I just noticed that my completion cause code > analysis > > was wrong in > > >> one > > >> case: > > >> > > >> - between drafts 06 and 07 it seems the definition > > of Speech > > >> Incomplete Timeout (Section 9.4.16) was > changed so that > > >> "partial-match" (013) was returned when it fired > > instead of "no-match" > > >> (001). > > >> > > >> So I guess what we are saying is that another way of > > terminating > > >> recognition is that the speech is just not > matching the > > grammar at > > >> all and that is what yeilds no-match (001). > (Though I > > could have > > >> sworn that a recognizer waits until you were done > > talking even if the > > >> utterance wasn't matching at all and that incomplete > > timeout was used > > >> in that case. Anyone care to clarify?) > > >> > > >> Anyways, I don't think this changes my real > point below. > > >> > > >> Andrew > > >> > > >> Andrew Wahbe wrote: > > >> > > >>> Actually, this is something I've raised before with > > the 07 draft > > >>> (see > > >>> > > > http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 > > 1510.html > > >>> ) > > >>> but is still not fixed in 08. > > >>> > > >>> I don't see a way to implement the VoiceXML > maxspeech > > timeout with > > >>> the current recognition completion cause codes. The > > semantics of the > > >>> Recognition Timer match those of VoiceXML maxspeech > > timeout. There > > >>> are currently 2 cause codes related to this timer: > > >>> > > >>> 003 recognition-timeout: RECOGNIZE in hotword mode > > completed without > > >>> a match due to a recognition-timeout > > >>> > > >>> 008 success-maxtime: RECOGNIZE request terminated > > because speech was > > >>> too long but whatever was spoken till that > point was a > > full match. > > >>> > > >>> But what should be thrown in a non-hotword > recognition > > when the > > >>> timer fires and there is no match? It can't be 001 > > no-match; the > > >>> VoiceXML interpreter will not be able to > distinguish a > > "nomatch" > > >>> from "maxspeech" in this case. > > >>> > > >>> Let's take a step back. If we really want > to describe > > what happened, > > >>> then I think we need to communicate to the client > > 1) why the > > >>> recognition was stopped and 2) if the result a > > no-match, a partial > > >>> match, or a complete match when it stopped. > > >>> > > >>> 1) The recognition can stop for the > following reasons > > (the related > > >>> cause codes in parens): > > >>> > > >>> no-input timeout (002) > > >>> complete timeout (000) > > >>> incomplete timeout (001 013) > > >>> recognition timeout (003 and 008) > > >>> speech too early (007) > > >>> cancelled (011) > > >>> various errors (004, 005, 006, 009, 010, 012) > > >>> > > >>> 2) The result is irrelevant for no-input, speech > > too early, > > >>> cancelled, and the errors. This leaves complete > > timeout, incomplete > > >>> timeout, and recognition timeout. By the > definition of > > complete > > >>> timeout, this is always a complete match > (000). Also > > by definition, > > >>> incomplete timeout could result in a no > match (001) or > > a partial > > >>> match (013) -- though VoiceXML will likely throw > > nomatch for either. > > >>> > > >>> So we are left with recognition timeout. It > could be a > > complete > > >>> match > > >>> (008 by the definition in section 9.4.11) > or a partial > > match (008 by > > >>> the definition in section 9.4.7), or a no-match... > > well if it was a > > >>> hotword recognition we have 003; for the > normal case, > > we have nothing. > > >>> > > >>> So hopefully I have clarified my earlier point that > > Recognition > > >>> Complete Cause codes 003 and 008 need revisiting. > > Sorry to be so > > >>> long winded but I think it's important that > VoiceXML be > > >>> implementable using MRCPv2; it is a very common use > > case for the specification. > > >>> > > >>> Before proposing a solution, I would like > to again ask > > why we need > > >>> to have a special cause code for > recognition-timeout > > in the hotword > > >>> case. The answer to this question is here: > > >>> > > > http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 > > 1501.html > > >>> > > >>> But I still don't see why you need a special > > completion cause to say > > >>> that the timer expired in hotword mode. The client > > knows what mode > > >>> the recognition is in. You just need to > tell it that > > the timer fired. > > >>> > > >>> What the client does need to know is if there is a > > valid result for > > >>> it to process. Thus I propose that 003 be > > rewritten as follows: > > >>> > > >>> 003 recognition-timeout: RECOGNIZE in hotword mode > > completed without > > >>> a match due to a recognition-timeout > > >>> > > >>> This will bring it back in line with the earlier > > Scansoft & Nuance > > >>> proposal (well now it just a Nuance > proposal) from a > > year and a half > > >>> ago: > > >>> > > > http://www1.ietf.org/mail-archive/web/speechsc/current/msg0 > > 0560.html > > >>> > > >>> Andrew > > >>> > > >>> _______________________________________________ > > >>> Speechsc mailing list > > >>> Speechsc@ietf.org > > >>> https://www1.ietf.org/mailman/listinfo/speechsc > > >>> > > >>> > > > > > > > > > > ----------------------------------------------------------- > > --------------------- > > > > > > > _______________________________________________ > > > Speechsc mailing list > > > Speechsc@ietf.org > > > https://www1.ietf.org/mailman/listinfo/speechsc > > > > > > > > > _______________________________________________ > > Speechsc mailing list > > Speechsc@ietf.org > > https://www1.ietf.org/mailman/listinfo/speechsc > > > > > > _______________________________________________ > > Speechsc mailing list > > Speechsc@ietf.org > > https://www1.ietf.org/mailman/listinfo/speechsc > > > > > > > > > > > > > _______________________________________________ > Speechsc mailing list > Speechsc@ietf.org > https://www1.ietf.org/mailman/listinfo/speechsc _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc From speechsc-bounces@ietf.org Tue Nov 15 11:59:34 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1Ec49e-0001kn-85; Tue, 15 Nov 2005 11:59:34 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1Ec49c-0001kL-VQ for speechsc@megatron.ietf.org; Tue, 15 Nov 2005 11:59:32 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id LAA08139 for ; Tue, 15 Nov 2005 11:58:55 -0500 (EST) Received: from maile.telecomitalia.it ([156.54.233.31]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1Ec4Qp-0002ni-GQ for speechsc@ietf.org; Tue, 15 Nov 2005 12:17:19 -0500 Received: from ptpxch009ba020.idc.cww.telecomitalia.it ([156.54.240.52]) by maile.telecomitalia.it with Microsoft SMTPSVC(6.0.3790.211); Tue, 15 Nov 2005 17:59:10 +0100 Received: from ptpevs009ba020.idc.cww.telecomitalia.it ([156.54.240.187]) by ptpxch009ba020.idc.cww.telecomitalia.it with Microsoft SMTPSVC(6.0.3790.211); Tue, 15 Nov 2005 17:59:10 +0100 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.326 Content-Class: urn:content-classes:message MIME-Version: 1.0 Priority: normal Importance: normal Subject: [Speechsc] Recording size and duration: open issue Date: Tue, 15 Nov 2005 17:58:11 +0100 Message-ID: Thread-Topic: [Speechsc] Recording size and duration: open issue thread-index: AcXqBcMEULhVUtpPRkSJ4hayHCZyYA== From: "Bergallo Patrizio" To: X-OriginalArrivalTime: 15 Nov 2005 16:59:10.0046 (UTC) FILETIME=[E5C387E0:01C5EA05] X-Spam-Score: 0.0 (/) X-Scan-Signature: 73734d43604d52d23b3eba644a169745 X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Content-Type: multipart/mixed; boundary="===============1491004952==" Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org This is a multi-part message in MIME format. --===============1491004952== Content-Type: multipart/alternative; boundary="----=_NextPart_000_1EFA70_01C5EA0E.47A059E0" Content-Class: urn:content-classes:message This is a multi-part message in MIME format. ------=_NextPart_000_1EFA70_01C5EA0E.47A059E0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Hello group, in a previous thread (http://www1.ietf.org/mail-archive/web/speechsc/current/msg01531.html) there was a discussion about size and duration of recorded audio. In draft 08, size and duration values have been added to the recog-only "waveform-uri" header (9.4.8), but to fully support a VoiceXML 2.0 interpreter implementation, these values should be added to the recorder "record-uri" header (10.4.7) too, I suppose with same syntax and semantics. In fact they directly map the size and duration record shadow variable of the VoiceXML 2.0 record element (2.3.6 of VoiceXML 2.0 recommendation). Furthermore, although not yet in the VoiceXML recommendation, I think these values should be added to the verification "waveform-uri" header (11.4.12) too. Any thought? Patrizio Bergallo, Loquendo. Gruppo Telecom Italia - Direzione e coordinamento di Telecom Italia = S.p.A. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D CONFIDENTIALITY NOTICE This message and its attachments are addressed solely to the persons = above and may contain confidential information. If you have received the = message in error, be informed that any use of the content hereof is = prohibited. Please return it immediately to the sender and delete the = message. Should you have any questions, please send an e_mail to = webmaster@telecomitalia.it. Thank = youwww.loquendo.com =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D ------=_NextPart_000_1EFA70_01C5EA0E.47A059E0 Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

Hello = group,

in a previous = thread
(http://www1.ietf.org/mail-archive/web/speechsc/current/msg0153= 1.html)
there was a discussion about size and duration of recorded = audio.
In draft 08, size and duration values have been added to the = recog-only
"waveform-uri" header (9.4.8),
but to fully support a = VoiceXML 2.0 interpreter implementation, these
values should be added = to the recorder "record-uri" header (10.4.7) too,
I suppose with same = syntax and semantics. In fact they directly map the
size and duration = record shadow variable of the VoiceXML 2.0 record
element (2.3.6 of = VoiceXML 2.0 recommendation).
Furthermore, although not yet in the = VoiceXML recommendation, I think
these values should be added to the = verification "waveform-uri" header
(11.4.12) too.

Any = thought?

Patrizio Bergallo, Loquendo.

Gruppo=20 Telecom Italia - Direzione e coordinamento di Telecom Italia = S.p.A.

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D
CONFIDENTIALITY=20 NOTICE
This message and its attachments are addressed solely to the=20 persons
above and may contain confidential information. If you have=20 received
the message in error, be informed that any use of the = content=20 hereof
is prohibited. Please return it immediately to the sender and=20 delete
the message. Should you have any questions, please send an = e_mail=20 to
<mailto:webmaster@telecomitalia= .it>webmaster@telecomitalia.it.=20 Thank you
<http://www.loquendo.com>www.loque= ndo.com
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D

------=_NextPart_000_1EFA70_01C5EA0E.47A059E0-- --===============1491004952== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc --===============1491004952==-- From speechsc-bounces@ietf.org Tue Nov 15 12:21:52 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1Ec4VE-0008O2-4d; Tue, 15 Nov 2005 12:21:52 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1Ec4VC-0008Mu-RC for speechsc@megatron.ietf.org; Tue, 15 Nov 2005 12:21:50 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id MAA10780 for ; Tue, 15 Nov 2005 12:21:17 -0500 (EST) Received: from maild.telecomitalia.it ([156.54.233.30]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1Ec4mS-000410-Nz for speechsc@ietf.org; Tue, 15 Nov 2005 12:39:42 -0500 Received: from ptpxch008ba020.idc.cww.telecomitalia.it ([156.54.240.51]) by maild.telecomitalia.it with Microsoft SMTPSVC(6.0.3790.211); Tue, 15 Nov 2005 18:21:56 +0100 Received: from ptpevs009ba020.idc.cww.telecomitalia.it ([156.54.240.187]) by ptpxch008ba020.idc.cww.telecomitalia.it with Microsoft SMTPSVC(6.0.3790.211); Tue, 15 Nov 2005 18:21:44 +0100 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.326 Content-Class: urn:content-classes:message Importance: normal Priority: normal MIME-Version: 1.0 Subject: [Speechsc] Recognition result format Date: Tue, 15 Nov 2005 18:20:45 +0100 Message-ID: Thread-Topic: [Speechsc] Recognition result format thread-index: AcXqCOoR0veDF6JwRC69bWEjXLvqKw== From: "Bergallo Patrizio" To: X-OriginalArrivalTime: 15 Nov 2005 17:21:44.0930 (UTC) FILETIME=[0D566C20:01C5EA09] X-Spam-Score: 0.0 (/) X-Scan-Signature: 00e94c813bef7832af255170dca19e36 X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Content-Type: multipart/mixed; boundary="===============0563773771==" Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org This is a multi-part message in MIME format. --===============0563773771== Content-Type: multipart/alternative; boundary="----=_NextPart_000_326157_01C5EA11.75EEB910" Content-Class: urn:content-classes:message This is a multi-part message in MIME format. ------=_NextPart_000_326157_01C5EA11.75EEB910 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Hello group, from the Vancouver meeting minutes and slides I don't understand which decision has been taken about the way to specify the recog result format (to allow a simpler transition from NLSML to EMMA in the future). >From last group discussions, I remember there were the "accept header" way and the "SIP End Point capability" way. Is there a plan to address this issue in the 09 draft? Could you please clarify this point, thank you. Regards, Patrizio Bergallo, Loquendo. =20 Gruppo Telecom Italia - Direzione e coordinamento di Telecom Italia = S.p.A. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D CONFIDENTIALITY NOTICE This message and its attachments are addressed solely to the persons = above and may contain confidential information. If you have received the = message in error, be informed that any use of the content hereof is = prohibited. Please return it immediately to the sender and delete the = message. Should you have any questions, please send an e_mail to = webmaster@telecomitalia.it. Thank = youwww.loquendo.com =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D ------=_NextPart_000_326157_01C5EA11.75EEB910 Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

Hello = group,

from the Vancouver meeting minutes and slides I don't = understand which
decision has been taken about the way to specify the = recog result format
(to allow a simpler transition from NLSML to EMMA = in the future).
From last group discussions, I remember there were = the "accept header"
way and the "SIP End Point capability" way.
Is = there a plan to address this issue in the 09 draft?
Could you please = clarify this point, thank you.

Regards,
Patrizio Bergallo, = Loquendo.

Gruppo=20 Telecom Italia - Direzione e coordinamento di Telecom Italia = S.p.A.

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D
CONFIDENTIALITY=20 NOTICE
This message and its attachments are addressed solely to the=20 persons
above and may contain confidential information. If you have=20 received
the message in error, be informed that any use of the = content=20 hereof
is prohibited. Please return it immediately to the sender and=20 delete
the message. Should you have any questions, please send an = e_mail=20 to
<mailto:webmaster@telecomitalia= .it>webmaster@telecomitalia.it.=20 Thank you
<http://www.loquendo.com>www.loque= ndo.com
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D

------=_NextPart_000_326157_01C5EA11.75EEB910-- --===============0563773771== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc --===============0563773771==-- From speechsc-bounces@ietf.org Tue Nov 15 12:40:14 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1Ec4n0-0007MF-Ll; Tue, 15 Nov 2005 12:40:14 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1Ec4mz-0007Ky-JF for speechsc@megatron.ietf.org; Tue, 15 Nov 2005 12:40:13 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id MAA11842 for ; Tue, 15 Nov 2005 12:39:40 -0500 (EST) Received: from sj-iport-3-in.cisco.com ([171.71.176.72] helo=sj-iport-3.cisco.com) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1Ec54G-0004Z5-C5 for speechsc@ietf.org; Tue, 15 Nov 2005 12:58:05 -0500 Received: from sj-core-4.cisco.com ([171.68.223.138]) by sj-iport-3.cisco.com with ESMTP; 15 Nov 2005 09:40:04 -0800 X-IronPort-AV: i="3.97,333,1125903600"; d="scan'208,217"; a="365479058:sNHT39643580" Received: from vtg-um-e2k6.sj21ad.cisco.com (vtg-um-e2k6.cisco.com [171.70.93.77]) by sj-core-4.cisco.com (8.12.10/8.12.6) with ESMTP id jAFHdx3r000209; Tue, 15 Nov 2005 09:40:01 -0800 (PST) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Subject: RE: [Speechsc] Recording size and duration: open issue Date: Tue, 15 Nov 2005 09:40:00 -0800 Message-ID: <03772D1EC8DE624A863058C75874A75C73A726@vtg-um-e2k6.sj21ad.cisco.com> Thread-Topic: [Speechsc] Recording size and duration: open issue Thread-Index: AcXqBcMEULhVUtpPRkSJ4hayHCZyYAABZ7WQ From: "Shanmugham, Saravanan" To: "Bergallo Patrizio" , X-Spam-Score: 0.6 (/) X-Scan-Signature: 33cc095b503da4365ce57c727e553cf1 Cc: X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Content-Type: multipart/mixed; boundary="===============0603377064==" Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org This is a multi-part message in MIME format. --===============0603377064== Content-class: urn:content-classes:message Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C5EA0B.9AADEAAE" This is a multi-part message in MIME format. ------_=_NextPart_001_01C5EA0B.9AADEAAE Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable That was the intent. The definition was not meant for Recognition only. =20 I will clarify. =20 Thanks, Sarvi ________________________________ From: speechsc-bounces@ietf.org [mailto:speechsc-bounces@ietf.org] On Behalf Of Bergallo Patrizio Sent: Tuesday, November 15, 2005 8:58 AM To: speechsc@ietf.org Subject: [Speechsc] Recording size and duration: open issue =09 =09 =20 Hello group, =09 in a previous thread =09 (http://www1.ietf.org/mail-archive/web/speechsc/current/msg01531.html) there was a discussion about size and duration of recorded audio. In draft 08, size and duration values have been added to the recog-only "waveform-uri" header (9.4.8), but to fully support a VoiceXML 2.0 interpreter implementation, these values should be added to the recorder "record-uri" header (10.4.7) too, I suppose with same syntax and semantics. In fact they directly map the size and duration record shadow variable of the VoiceXML 2.0 record element (2.3.6 of VoiceXML 2.0 recommendation). Furthermore, although not yet in the VoiceXML recommendation, I think these values should be added to the verification "waveform-uri" header (11.4.12) too. =09 Any thought? =09 Patrizio Bergallo, Loquendo. =09 Gruppo Telecom Italia - Direzione e coordinamento di Telecom Italia S.p.A. =09 = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D CONFIDENTIALITY NOTICE This message and its attachments are addressed solely to the persons above and may contain confidential information. If you have received the message in error, be informed that any use of the content hereof is prohibited. Please return it immediately to the sender and delete the message. Should you have any questions, please send an e_mail to webmaster@telecomitalia.it. Thank you www.loquendo.com = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =09 =09 ------_=_NextPart_001_01C5EA0B.9AADEAAE Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

That was the intent. The = definition was=20 not meant for Recognition only.

I will clarify.

Thanks,

Sarvi

From: speechsc-bounces@ietf.org=20 [mailto:speechsc-bounces@ietf.org] On Behalf Of Bergallo=20 Patrizio
Sent: Tuesday, November 15, 2005 8:58 = AM
To:=20 speechsc@ietf.org
Subject: [Speechsc] Recording size and = duration:=20 open issue

Hello group,

in a previous=20 = thread
(http://www1.ietf.org/mail-archive/web/speechsc/current/msg0153= 1.html)
there=20 was a discussion about size and duration of recorded audio.
In = draft 08,=20 size and duration values have been added to the = recog-only
"waveform-uri"=20 header (9.4.8),
but to fully support a VoiceXML 2.0 interpreter=20 implementation, these
values should be added to the recorder = "record-uri"=20 header (10.4.7) too,
I suppose with same syntax and semantics. In = fact they=20 directly map the
size and duration record shadow variable of the = VoiceXML=20 2.0 record
element (2.3.6 of VoiceXML 2.0 = recommendation).
Furthermore,=20 although not yet in the VoiceXML recommendation, I think
these = values=20 should be added to the verification "waveform-uri" header
(11.4.12) = too.

Any thought?

Patrizio Bergallo, = Loquendo.

Gruppo=20 Telecom Italia - Direzione e coordinamento di Telecom Italia=20 S.p.A.

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D
CONFIDENTIALITY=20 NOTICE
This message and its attachments are addressed solely to the = persons
above and may contain confidential information. If you have = received
the message in error, be informed that any use of the = content=20 hereof
is prohibited. Please return it immediately to the sender = and=20 delete
the message. Should you have any questions, please send an = e_mail=20 to
<mailto:webmaster@telecomitalia= .it>webmaster@telecomitalia.it.=20 Thank you
<http://www.loquendo.com>www.loque= ndo.com
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D

------_=_NextPart_001_01C5EA0B.9AADEAAE-- --===============0603377064== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc --===============0603377064==-- From speechsc-bounces@ietf.org Tue Nov 15 12:44:13 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1Ec4qr-0000Z6-4t; Tue, 15 Nov 2005 12:44:13 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1Ec4qq-0000Ys-0J for speechsc@megatron.ietf.org; Tue, 15 Nov 2005 12:44:12 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id MAA12035 for ; Tue, 15 Nov 2005 12:43:38 -0500 (EST) Received: from sj-iport-5.cisco.com ([171.68.10.87]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1Ec586-0004fY-Rv for speechsc@ietf.org; Tue, 15 Nov 2005 13:02:03 -0500 Received: from sj-core-3.cisco.com ([171.68.223.137]) by sj-iport-5.cisco.com with ESMTP; 15 Nov 2005 09:44:02 -0800 X-IronPort-AV: i="3.97,333,1125903600"; d="scan'208,217"; a="230674649:sNHT37222824" Received: from vtg-um-e2k6.sj21ad.cisco.com (vtg-um-e2k6.cisco.com [171.70.93.77]) by sj-core-3.cisco.com (8.12.10/8.12.6) with ESMTP id jAFHi0Iq004246; Tue, 15 Nov 2005 09:44:00 -0800 (PST) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Subject: RE: [Speechsc] Recognition result format Date: Tue, 15 Nov 2005 09:43:59 -0800 Message-ID: <03772D1EC8DE624A863058C75874A75C73A728@vtg-um-e2k6.sj21ad.cisco.com> Thread-Topic: [Speechsc] Recognition result format Thread-Index: AcXqCOoR0veDF6JwRC69bWEjXLvqKwAAryYQ From: "Shanmugham, Saravanan" To: "Bergallo Patrizio" , X-Spam-Score: 0.6 (/) X-Scan-Signature: d890c9ddd0b0a61e8c597ad30c1c2176 Cc: X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Content-Type: multipart/mixed; boundary="===============0993598641==" Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org This is a multi-part message in MIME format. --===============0993598641== Content-class: urn:content-classes:message Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C5EA0C.292009D4" This is a multi-part message in MIME format. ------_=_NextPart_001_01C5EA0C.292009D4 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable This was not decided in the meeting but on the alias. The decision was to add the Accept header similar to those found in SIP to the MRCP messages. This allows the client to specify to the server what it wants as result format. This can also be used in the SET-PARAMS/GET-PARAMS methods to the resource to set it this value so, that the client does not have to send it all the time. =20 Now the above, does not allow for registration/discovery and routing, for which this header is too late. For which we propose to use SIP endpoint capabilities, which would be described in separate draft. =20 Sarvi ________________________________ From: speechsc-bounces@ietf.org [mailto:speechsc-bounces@ietf.org] On Behalf Of Bergallo Patrizio Sent: Tuesday, November 15, 2005 9:21 AM To: speechsc@ietf.org Subject: [Speechsc] Recognition result format =09 =09 =20 Hello group, =09 from the Vancouver meeting minutes and slides I don't understand which decision has been taken about the way to specify the recog result format (to allow a simpler transition from NLSML to EMMA in the future). From last group discussions, I remember there were the "accept header" way and the "SIP End Point capability" way. Is there a plan to address this issue in the 09 draft? Could you please clarify this point, thank you. =09 Regards, Patrizio Bergallo, Loquendo. =09 =09 Gruppo Telecom Italia - Direzione e coordinamento di Telecom Italia S.p.A. =09 = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D CONFIDENTIALITY NOTICE This message and its attachments are addressed solely to the persons above and may contain confidential information. If you have received the message in error, be informed that any use of the content hereof is prohibited. Please return it immediately to the sender and delete the message. Should you have any questions, please send an e_mail to webmaster@telecomitalia.it. Thank you www.loquendo.com = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =09 =09 ------_=_NextPart_001_01C5EA0C.292009D4 Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

This was not decided in the meeting but on = the alias.=20 The decision was to add the Accept header similar to those found in SIP = to the=20 MRCP messages. This allows the client to specify to the server what it = wants as=20 result format. This can also be used in the SET-PARAMS/GET-PARAMS = methods to the=20 resource to set it this value so, that the client does not have to send = it all=20 the time.

Now the above, does not allow for=20 registration/discovery and routing, for which this header is too late. = For which=20 we propose to use SIP endpoint capabilities, which would be described in = separate draft.

Sarvi

From: speechsc-bounces@ietf.org=20 [mailto:speechsc-bounces@ietf.org] On Behalf Of Bergallo=20 Patrizio
Sent: Tuesday, November 15, 2005 9:21 = AM
To:=20 speechsc@ietf.org
Subject: [Speechsc] Recognition result=20 format

Hello group,

from the Vancouver meeting = minutes and=20 slides I don't understand which
decision has been taken about the = way to=20 specify the recog result format
(to allow a simpler transition from = NLSML=20 to EMMA in the future).
From last group discussions, I remember = there were=20 the "accept header"
way and the "SIP End Point capability" = way.
Is there=20 a plan to address this issue in the 09 draft?
Could you please = clarify this=20 point, thank you.

Regards,
Patrizio Bergallo,=20 Loquendo.

Gruppo Telecom Italia - Direzione e coordinamento = di=20 Telecom Italia S.p.A.

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D
CONFIDENTIALITY=20 NOTICE
This message and its attachments are addressed solely to the = persons
above and may contain confidential information. If you have = received
the message in error, be informed that any use of the = content=20 hereof
is prohibited. Please return it immediately to the sender = and=20 delete
the message. Should you have any questions, please send an = e_mail=20 to
<mailto:webmaster@telecomitalia= .it>webmaster@telecomitalia.it.=20 Thank you
<http://www.loquendo.com>www.loque= ndo.com
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D

------_=_NextPart_001_01C5EA0C.292009D4-- --===============0993598641== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc --===============0993598641==-- From speechsc-bounces@ietf.org Wed Nov 16 07:34:40 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EcMUq-0006QP-Fv; Wed, 16 Nov 2005 07:34:40 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EcMUp-0006QH-3W for speechsc@megatron.ietf.org; Wed, 16 Nov 2005 07:34:39 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id HAA21600 for ; Wed, 16 Nov 2005 07:34:05 -0500 (EST) Received: from mail.vocalocity.net ([38.116.10.177] helo=smtp.vocalocity.net) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EcMmF-0006HM-9Z for speechsc@ietf.org; Wed, 16 Nov 2005 07:52:40 -0500 Received: by smtp.vocalocity.net (Postfix, from userid 9999) id A2BB316CDE3; Wed, 16 Nov 2005 07:34:16 -0500 (EST) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Date: Wed, 16 Nov 2005 07:34:15 -0500 Message-ID: <92E86BBD06161E4299A56009516472EDB31952@gates.vcorp.vocalocity.net> Thread-Topic: Minor editorial fix in -08 Thread-Index: AcXqnNackVn4JNYhRCWnOb7PwdO1tQ== From: "Jeff Haynie" To: X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on revelation.vcorp.vocalocity.net X-Spam-Status: No, score=-5.7 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, HTML_MESSAGE autolearn=ham version=3.0.4 X-Spam-Score: 0.5 (/) X-Scan-Signature: 39bd8f8cbb76cae18b7e23f7cf6b2b9f Subject: [Speechsc] Minor editorial fix in -08 X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Content-Type: multipart/mixed; boundary="===============1743017193==" Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org This is a multi-part message in MIME format. --===============1743017193== Content-class: urn:content-classes:message Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C5EAAA.0E1F6AD3" This is a multi-part message in MIME format. ------_=_NextPart_001_01C5EAAA.0E1F6AD3 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable 9.4.24 last sentence is missing "r" in number: =20 This helps in a numbe of use cases, including where the client wishes to reuse an open recognition session with an existing media session for multiple telephone calls. ------_=_NextPart_001_01C5EAAA.0E1F6AD3 Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

9.4.24 last=20 sentence is missing "r" in number:

This helps in a numbe of use = cases,=20 including where the client wishes to reuse an open recognition session = with an=20 existing media session for multiple telephone = calls.

------_=_NextPart_001_01C5EAAA.0E1F6AD3-- --===============1743017193== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc --===============1743017193==-- From speechsc-bounces@ietf.org Thu Nov 17 10:09:31 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EclOF-0007py-Pt; Thu, 17 Nov 2005 10:09:31 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EclOE-0007pt-LW for speechsc@megatron.ietf.org; Thu, 17 Nov 2005 10:09:30 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA29203 for ; Thu, 17 Nov 2005 10:08:57 -0500 (EST) Received: from mailf.telecomitalia.it ([156.54.233.32]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1Eclft-0005Kd-2i for speechsc@ietf.org; Thu, 17 Nov 2005 10:27:46 -0500 Received: from ptpxch009ba020.idc.cww.telecomitalia.it ([156.54.240.52]) by mailf.telecomitalia.it with Microsoft SMTPSVC(6.0.3790.211); Thu, 17 Nov 2005 16:09:13 +0100 Received: from ptpevs009ba020.idc.cww.telecomitalia.it ([156.54.240.187]) by ptpxch009ba020.idc.cww.telecomitalia.it with Microsoft SMTPSVC(6.0.3790.211); Thu, 17 Nov 2005 16:09:13 +0100 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.326 Content-Class: urn:content-classes:message Importance: normal MIME-Version: 1.0 Priority: normal Subject: [Speechsc] INTERPRET and Interpret-Text mismatch Date: Thu, 17 Nov 2005 16:08:11 +0100 Message-ID: Thread-Topic: [Speechsc] INTERPRET and Interpret-Text mismatch thread-index: AcXriLmSnyNVndTcTkCf3dBRkdgi8g== From: "Bergallo Patrizio" To: X-OriginalArrivalTime: 17 Nov 2005 15:09:13.0731 (UTC) FILETIME=[DEE13D30:01C5EB88] X-Spam-Score: 0.0 (/) X-Scan-Signature: 6d95a152022472c7d6cdf886a0424dc6 X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Content-Type: multipart/mixed; boundary="===============0548511951==" Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org This is a multi-part message in MIME format. --===============0548511951== Content-Type: multipart/alternative; boundary="----=_NextPart_000_23741B_01C5EB91.4075E2A0" Content-Class: urn:content-classes:message This is a multi-part message in MIME format. ------=_NextPart_000_23741B_01C5EB91.4075E2A0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: quoted-printable Hi group,=20 a couple of comments on the INTERPRET method: 1) In the Interpret-Text header description (see page 79 of draft 08) is said that: "The value is a content-id that refers to a MIME entity of type plain/text in the body of the message." In the INTERPRET method description (see page 112) is said that: "The INTERPRET method (...) takes as input an interpret-text header containing the text for which the semantic interpretation is desired" and "Recognizer grammar data is treated in the same way as it is when issuing a RECOGNIZE method call." The example is according to method description. Is there any reason for which the MIME entity mentioned in the header description is needed?=20 If yes, we think that should be clarified that the INTERPRET body will contains grammar data and text to be interpreted. If no, we think that the Interpret-Text value can be the text to be interpreted, without using Content-Id. 2) In the two examples of INTERPRET usage (see pages 112 and 114), when server sends result to client, it uses a response message instead of INTERPRETATION-COMPLETE event. Regards, Patrizio Bergallo, Loquendo. Gruppo Telecom Italia - Direzione e coordinamento di Telecom Italia = S.p.A. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D CONFIDENTIALITY NOTICE This message and its attachments are addressed solely to the persons = above and may contain confidential information. If you have received the = message in error, be informed that any use of the content hereof is = prohibited. Please return it immediately to the sender and delete the = message. Should you have any questions, please send an e_mail to = webmaster@telecomitalia.it. Thank = youwww.loquendo.com =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D ------=_NextPart_000_23741B_01C5EB91.4075E2A0 Content-Type: text/html; charset="US-ASCII" Content-Transfer-Encoding: quoted-printable

Hi group,
a = couple of comments on the INTERPRET method:

1)
In the = Interpret-Text header description (see page 79 of draft 08) is
said = that:
"The value is a
content-id that refers to a MIME = entity of type plain/text in the
body of the message."

In = the INTERPRET method description (see page 112) is said that:
"The = INTERPRET method (...) takes as input an
interpret-text header = containing the text for which the semantic
interpretation is = desired"
and
"Recognizer grammar data is treated in the same = way as it is when
issuing a RECOGNIZE method call."

The = example is according to method description.

Is there any reason = for which the MIME entity mentioned in the header
description is = needed?
If yes, we think that should be clarified that the INTERPRET = body will
contains grammar data and text to be interpreted. If no, we = think that
the Interpret-Text value can be the text to be = interpreted, without
using Content-Id.

2)
In the two = examples of INTERPRET usage (see pages 112 and 114), when
server = sends result to client, it uses a response message instead = of
INTERPRETATION-COMPLETE event.

Regards,
Patrizio = Bergallo, Loquendo.

Gruppo=20 Telecom Italia - Direzione e coordinamento di Telecom Italia = S.p.A.

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D
CONFIDENTIALITY=20 NOTICE
This message and its attachments are addressed solely to the=20 persons
above and may contain confidential information. If you have=20 received
the message in error, be informed that any use of the = content=20 hereof
is prohibited. Please return it immediately to the sender and=20 delete
the message. Should you have any questions, please send an = e_mail=20 to
<mailto:webmaster@telecomitalia= .it>webmaster@telecomitalia.it.=20 Thank you
<http://www.loquendo.com>www.loque= ndo.com
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D

------=_NextPart_000_23741B_01C5EB91.4075E2A0-- --===============0548511951== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc --===============0548511951==-- From speechsc-bounces@ietf.org Mon Nov 28 13:00:17 2005 Received: from localhost.cnri.reston.va.us ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EgnIX-00051C-2m; Mon, 28 Nov 2005 13:00:17 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EgnIU-0004zp-RG for speechsc@megatron.ietf.org; Mon, 28 Nov 2005 13:00:15 -0500 Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id MAA09738 for ; Mon, 28 Nov 2005 12:59:24 -0500 (EST) Received: from fw01.db01.voxpilot.com ([212.17.54.82] helo=mail.voxpilot.com) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EgncF-0007QD-FB for speechsc@ietf.org; Mon, 28 Nov 2005 13:20:42 -0500 Received: from daburkewxp (dsl-34-34.dsl.netsource.ie [213.79.34.34]) by mail.voxpilot.com (Postfix) with ESMTP id B9379214046 for ; Mon, 28 Nov 2005 17:59:49 +0000 (GMT) Message-ID: <042701c5f445$8119f550$6600000a@db01.voxpilot.com> From: "Dave Burke" To: Date: Mon, 28 Nov 2005 17:59:40 -0000 MIME-Version: 1.0 X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.2180 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 X-Spam-Score: 0.2 (/) X-Scan-Signature: 769a46790fb42fbb0b0cc700c82f7081 Subject: [Speechsc] Clarification on message-length X-BeenThere: speechsc@ietf.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Speech Services Control Working Group List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Content-Type: multipart/mixed; boundary="===============1918340819==" Sender: speechsc-bounces@ietf.org Errors-To: speechsc-bounces@ietf.org This is a multi-part message in MIME format. --===============1918340819== Content-Type: multipart/alternative; boundary="----=_NextPart_000_0424_01C5F445.80CF57C0" This is a multi-part message in MIME format. ------=_NextPart_000_0424_01C5F445.80CF57C0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Does message-length include the message-body or does it only pertain to = the start line and header fields? Not clear from the text (and the = examples don't appear to have the correct message-length or = Content-Length). Thanks, Dave ------=_NextPart_000_0424_01C5F445.80CF57C0 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable

Does message-length include the = message-body or=20 does it only pertain to the start line and header fields? Not clear from = the=20 text (and the examples don't appear to have the correct message-length = or=20 Content-Length).

Thanks,

Dave

------=_NextPart_000_0424_01C5F445.80CF57C0-- --===============1918340819== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Speechsc mailing list Speechsc@ietf.org https://www1.ietf.org/mailman/listinfo/speechsc --===============1918340819==--