From: Paul Howard
Date: Tue, 04 Nov 2003 11:08:14 -0500
To: Vipin Jain
CC: l2tp ietflist, Keyur Parikh, vipin@riverstonenet.com
Subject: [L2tpext] Re: Questions about draft-ietf-l2tpext-failover-02.txt
Message-ID: <3FA7CEEE.9000209@juniper.net>
References: <20031031202540.81205.qmail@web41310.mail.yahoo.com>

Vipin,

See [pwh4] inline

Thanx,

Paul

Vipin Jain wrote:
Paul,

my comments inline..

  
My comments inline [pwh3]
    

  
Usually the implementations, at least the ones I have seen, have the
protocol layer riding on top of a reliable layer. The reliable layer does a
litmus test for an incoming packet based on the sequence numbers; if it
matches, then the expected sequence numbers are updated and the packet is
delivered to the protocol layer to parse the packet. Similarly, while sending,
Ns is updated and the packet is queued for transmission. Therefore, in order
to ensure that the transport layer notices the sequence number miss, it should
be able to quickly check (instead of going through all AVPs) to know if it
needs to hand over the message to the protocol layer. Fields that we could
play around with are the Ns and Nr fields. We can't possibly fiddle around
with other fields. However, if we choose known Ns, Nr (i.e. both zero) then we
are making it too predictable to know the sequence numbers during failover.
This is the same concern I had mentioned earlier; to solve that we agreed to
choose sequence numbers that were diametrically opposite.
[pwh3] I agree with the issues about separation of the reliable layer and the
protocol layer. You mention we can only play with Ns and Nr. Why not take
one of the reserved bits in the header to signal a UI frame (Ns = 0, Nr = 0)?
The reliable layer would handle all UI frames - thus avoiding the issue of
whether to pass up to the protocol layer or not. The handshaking to do the
failover could thus be done with a sequence of UI frames. The authenticated
handshaking with UI frames would include the new sequence numbers to be used,
thus avoiding predictability of the new Ns.

Reserved Bits: I am in for that. Based on the discussions earlier Mark and
we agreed that if we can get it done by using a control plane message exchange
there is no need to make a L2TP-header change for this.
[pwh4] I'm not sure how to interpret this comment.   I agree we don't want to make changes where we can avoid them, but this needs to be balanced against providing a resource efficient solution.   By creating the concept of unnumbered frames to do reliable layer signaling then we can have the reliable layer re-sync the control plane without requiring a separate tunnel to do the resync and without requiring a tid change.   IMHO, this would seem to justify taking a reserved bit to indicate UI.
Using Ns=0, Nr=0 Scheme: This is flawed because it might naturally coincide
with the sequence numbers of a tunnel thereby confusing this as an
acknowledgement for a packet sent earlier.
[pwh4] I don't think so.   With unnumbered frames, there is no Ns/Nr thus they are ignored.   I mentioned 0/0 only because they should probably be set to something.   The UI bit would be looked at by the reliable layer before it looks at Ns/Nr.   UI frames would be handled entirely by the reliable layer and not the protocol layer.
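
The ordering described in [pwh4], with the UI bit examined before Ns/Nr, might look like the sketch below. The bit position `UI_FLAG`, the `ControlHeader` type and the returned labels are all hypothetical; the thread does not say which reserved bit would be taken or how an implementation would name these paths.

```python
from dataclasses import dataclass

# Hypothetical position for the proposed UI flag. The thread does not say
# which reserved header bit would be used; this value is for illustration.
UI_FLAG = 0x0400


@dataclass
class ControlHeader:
    flags: int  # 16-bit flags/version word of the control header
    ns: int     # sequence number of this message
    nr: int     # sequence number expected next from the peer


def classify(header: ControlHeader, expected_ns: int) -> str:
    """Decide what the reliable layer does with an incoming control message."""
    if header.flags & UI_FLAG:
        # Unnumbered frame: Ns/Nr are never examined, and the frame is
        # handled entirely inside the reliable layer.
        return "reliable-layer"
    if header.ns == expected_ns:
        # Normal numbered frame that matches the litmus test.
        return "deliver-to-protocol-layer"
    # Numbered frame that fails the sequence check.
    return "discard-or-queue"
```

Because the flag is tested first, a frame carrying Ns = 0, Nr = 0 with the UI bit set is never mistaken for an acknowledgement, whatever the tunnel's current sequence numbers are.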
  

    
  
I mentioned my concerns for reusing the existing tunnel to transmit new
control packets above. Given that, I'd be inclined to do something close to
what Keyur and Leo proposed to achieve a hitless dataplane failover mechanism:
- Establish a new tunnel (let's call it Failover Tunnel) to reset Ns values
(and optionally the authentication information) on two endpoints for a given
tunnel. Please note that authentication carried for a tunnel is different
from the authentication used for 'Failover tunnel' itself. Once Ns, Nr values
are set, two endpoints can start exchanging the control packets on the old
tunnel. Sessions are recovered as mentioned in the mechanism today.
[pwh3] What would be the packet exchange here? I presume we would have the
3 way handshake to establish the failover tunnel and then packets to do the
reset? Or would they be carried on the 3 way handshake?

packet exchange:
- establish a new tunnel (called 'Failover Tunnel')
- exchange the new sequence numbers of the old tunnels for control plane
- Both ends would reset their sequence numbers based on the values exchanged on
'Failover Tunnel'.
- The 'Failover Tunnel' is torn down after all tunnels are considered recovered
(i.e. when they have been exchanged and agreed upon by both peers). Remaining
tunnels are torn down assuming they do not need recovery by at least one of the
endpoints.
  
[pwh4] So we're looking at SCCRQ, SCCRP, SCCCN, packets to exchange sequence numbers, and StopCCN for every failover tunnel?  
  
One negative to this approach is that it would require the systems to have the
resources available to at least temporarily manage the additional tunnels -
this might imply being able to temporarily peak at double the normally
supported number of tunnels. If the non-failed endpoint was at its maximum
number of tunnels, how would it know to make an exception for the failover
tunnel setup?
    
And therefore we would RECOMMEND keeping space for at least one tunnel for
recovery purposes. If it can't establish the tunnel, then it could retry 'x'
times before concluding the tunnel recovery mechanism failed. This is much
better than the current proposal where we'd establish one new tunnel for every
old tunnel.
  
[pwh4] If we want recovery to proceed expeditiously (and we really only have a maximum of 1 hello timeout plus max retransmit timeout), then we really need to be able to recover tunnels in parallel. This could be very resource intensive. It doesn't seem likely that a large number of tunnels will necessarily be coming from the same peer. I typically see not more than a small handful (say 3) of tunnels coming from the same peer to allow for different service policies. With this scenario, I'd need an additional 33% resources (one failover tunnel for every 3 tunnels) to recover the tunnels in parallel - that's a rather steep price to pay.
  
How does the transition to the new sequence number occur?
I presume we hand off the new sequence numbers to the reliable layer and
it then purges its re-ordering rx queue, renumbers any outstanding
transmits and immediately re-transmits them? Any stale receives get
automatically discarded as out of window.
    
- Renumbering outstanding transmits might create more unpredictability for no
benefit. Because control plane on the failed node would have lost the context
related to previous messages, it is better to flush everything off and start
with new sequence numbers.
[pwh4] I disagree.   Say the outstanding transmit is an ICCN.   The failed node may or may not have remembered sending the ICRP.   By re-sending the ICCN, we allow the failed node at least the option of continuing the setup.   The state machines at the protocol layer should already be capable of handling an unexpected ICCN (or for that matter any unexpected packet).    If we were to throw away these packets, then we're potentially requiring additional complication at the protocol layer.   I don't know how your protocol layer is implemented, but mine treats the reliable layer as a pipe.   What I put into it is guaranteed to be delivered or I get a tunnel failed indication.   It doesn't seem that I'd really want a tunnel failed indication here, but that would be my only choice if the reliable layer threw away a packet that had been submitted for transmit.
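
A minimal sketch of the renumbering option argued for in [pwh4]: messages still waiting for an acknowledgement keep their payload and order, and only their Ns changes. The function name and the `(ns, payload)` tuple layout are hypothetical; the only protocol fact relied on is that Ns is a 16-bit counter that wraps.

```python
SEQ_MOD = 1 << 16  # Ns and Nr are 16-bit counters that wrap around


def renumber_outstanding(unacked, new_ns):
    """Renumber unacknowledged control messages into a new sequence space.

    unacked: list of (old_ns, payload) tuples in original send order.
    new_ns:  first Ns of the new sequence space agreed during resync.
    Returns (renumbered_messages, next_ns_to_use).
    """
    renumbered = []
    ns = new_ns
    for _old_ns, payload in unacked:
        # Payload and ordering are preserved; only the Ns is replaced.
        renumbered.append((ns, payload))
        ns = (ns + 1) % SEQ_MOD
    return renumbered, ns
```

With this approach the protocol layer still sees the reliable layer as a pipe: an ICCN that was queued before the failover is retransmitted under its new Ns rather than silently dropped.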
The old tunnel becomes active only upon getting confirmation from its peer that
it has reset control plane sequence numbers. So this would have to be a three
way handshake as described below. For example, for an old tunnel:
- Failed node sends: "Reset Nr to 5665 for Old Local Tid = 9, Old Tid = 6".
- Non Failed node responds: "Nr for Old Tid = 9 reset to 5665, Reset Nr to 4435
for the same tunnel (i.e. Old Local Tid = 6, Old Tid = 9 from this node's
perspective)". Failed endpoint upon getting this message must first enqueue the
response of this message and then start sending control messages on the Old
tunnel.
- Failed node sends: "Nr for Old Tid = 9 reset to 4435". Non failed node upon
getting this can start sending control messages.
  
[pwh4] I presume in the case where control messages with the new sequence number space start arriving at the non-failed node before the final resync message, the messages get discarded as OOW.   The failed node will then re-transmit and they'll be accepted once the non-failed node gets the final resync message.
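
The behaviour presumed in [pwh4] can be sketched as follows. It assumes that the non-failed node keeps its old receive window until the final resync message arrives; the thread does not settle this, and the class, method names and window size are hypothetical. The value 5665 is taken from the example above.

```python
SEQ_MOD = 1 << 16  # Ns and Nr are 16-bit counters that wrap around


def in_window(ns: int, expected_ns: int, window: int) -> bool:
    """True if ns lies in [expected_ns, expected_ns + window), modulo 2**16."""
    return (ns - expected_ns) % SEQ_MOD < window


class NonFailedEndpoint:
    """Receive side of the non-failed node for one old tunnel (illustrative)."""

    def __init__(self, expected_ns: int, window: int = 4) -> None:
        self.expected_ns = expected_ns  # next Ns expected from the peer
        self.window = window            # receive window size (assumed value)
        self.pending_ns = None          # peer's new Ns, not yet in effect

    def on_reset_request(self, peer_new_ns: int) -> None:
        # First handshake message: note the peer's new Ns, but keep the old
        # sequence space until the handshake completes (assumption).
        self.pending_ns = peer_new_ns

    def on_final_resync(self) -> None:
        # Third handshake message: switch to the new sequence space.
        if self.pending_ns is not None:
            self.expected_ns = self.pending_ns
            self.pending_ns = None

    def on_control_message(self, ns: int) -> str:
        if not in_window(ns, self.expected_ns, self.window):
            return "discard-OOW"
        if ns == self.expected_ns:
            self.expected_ns = (self.expected_ns + 1) % SEQ_MOD
            return "accept"
        return "queue"  # in window but ahead of the next expected message
```

A control message sent with Ns 5665 before `on_final_resync()` is discarded as out of window; the same message retransmitted afterwards is accepted.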
  
It's not too hard to send a flood of StopCCNs to take down a tunnel even not
knowing the sequence numbers. Yes, a reset to 0 makes it easier. Your
suggestion below would help with this (and with the UDP re-ordering problem
above). Posit that each side chooses its new Ns in the 2nd and 3rd packets
of the handshake. Each side chooses an Ns that is away from recent prior
traffic. This restores unpredictability for the StopCCN scenario and
addresses the UDP re-ordering.
    
    
  
[pwh3] We might want the non-failed endpoint to provide a hint in the 2nd
packet of the exchange to help the failed endpoint pick an Ns that is far
enough away. Might be useful if the failed endpoint's mirroring of its Ns is
relatively infrequent and there had been a lot of control traffic on the
tunnel.
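
One way to pick an Ns that is "far enough away" is sketched below, assuming the reference value is either the failed endpoint's last mirrored Ns or the hint supplied by the non-failed endpoint. The function name and the choice of a random offset between one quarter and three quarters of the sequence space are illustrative assumptions, not something the thread or the draft specifies.

```python
import secrets

SEQ_MOD = 1 << 16       # Ns and Nr are 16-bit counters that wrap around
QUARTER = SEQ_MOD // 4


def pick_new_ns(reference_ns: int) -> int:
    """Pick a new Ns well away from reference_ns, with a random component.

    reference_ns: the most recent Ns known for this tunnel, for example
    the hint carried in the 2nd packet of the exchange (assumption).
    """
    # Offset in [QUARTER, 3 * QUARTER): never close to the old traffic in
    # either direction, and not a fixed, guessable distance from it.
    offset = QUARTER + secrets.randbelow(2 * QUARTER)
    return (reference_ns + offset) % SEQ_MOD
```

Keeping the new value out of the quarter of the sequence space on either side of the reference means stale or re-ordered packets from before the failover fall outside the new receive window, while the random offset avoids a predictable reset value.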
    
Sure, picking our own Ns and scheme described above would take care of this
security hole.  
  
thanks,
-- vipin

_______________________________________________
L2tpext mailing list
L2tpext@ietf.org
https://www1.ietf.org/mailman/listinfo/l2tpext

From: Vipin Jain
Date: Tue, 4 Nov 2003 20:05:38 -0800 (PST)
To: Paul Howard
Cc: l2tp ietflist, Keyur Parikh, vipin@riverstonenet.com
Subject: [L2tpext] Re: Questions about draft-ietf-l2tpext-failover-02.txt
Message-ID: <20031105040538.90291.qmail@web41301.mail.yahoo.com>
In-Reply-To: <3FA7CEEE.9000209@juniper.net>

hi Paul,

my response inline..

> Reserved Bits: I am in for that.
> Based on the discussions earlier Mark and we agreed that if we can get it
> done by using a control plane message exchange there is no need to make a
> L2TP-header change for this.
> [pwh4] I'm not sure how to interpret this comment. I agree we don't want to
> make changes where we can avoid them, but this needs to be balanced against
> providing a resource efficient solution. By creating the concept of
> unnumbered frames to do reliable layer signaling then we can have the
> reliable layer re-sync the control plane without requiring a separate tunnel
> to do the resync and without requiring a tid change. IMHO, this would seem
> to justify taking a reserved bit to indicate UI.

I don't think introducing the concept of having unnumbered messages and
changing header bits is a good idea. It would bring in the following problems:
- How do you ack an unnumbered message? More importantly, how do you reliably
  transmit them? Do they take the same transmit queue and comply with rx
  window constraints?
- What if a node is bombarded with such unnumbered messages? Are we supposed
  to interpret each one of them?

> Using Ns=0, Nr=0 Scheme: This is flawed because it might naturally coincide
> with the sequence numbers of a tunnel thereby confusing this as an
> acknowledgement for a packet sent earlier.
> [pwh4] I don't think so. With unnumbered frames, there is no Ns/Nr thus they
> are ignored. I mentioned 0/0 only because they should probably be set to
> something. The UI bit would be looked at by the reliable layer before it
> looks at Ns/Nr. UI frames would be handled entirely by the reliable layer
> and not the protocol layer.

Using Ns=0, Nr=0 when the UI bit is set invites interpreting any message with
Nr=0, Ns=0 and the UI bit set - does it invite a DoS attack?

> One negative to this approach is that it would require the systems to have
> the resources available to at least temporarily manage the additional
> tunnels - this might imply being able to temporarily peak at double the
> normally supported number of tunnels. If the non-failed endpoint was at it's
> maximum number of tunnels, how would it know to make an exception for the
> failover tunnel setup?
> And therefore we would RECOMMEND keeping space for at least tunnel for
> recovery purposes. If it can't establish the tunnel, then it could retry 'x'
> times before concluding the tunnel recovery mechanism failed. This is much
> better than current proposal where we'd establish one new tunnel for every
> old tunnel.
> [pwh4] If we want recovery to proceed expeditiously (and we really only have
> a maximum of 1 hello timeout plus max retransmit timeout), then we really
> need to be able to recovery tunnels in parallel. This could be very resource
> intensive. It doesn't seem likely that a large number of tunnels will
> necessarily be coming from the same peer. I typically see not more than a
> small handful (say 3) tunnels coming from the same peer to allow for
> different service policies. With this scenario, I'd need an additional 33%
> resources to recover the tunnels in parallel - that's a rather steep price
> to pay.

Parallel recovery was definitely one of the design goals (Appendix A.2); so,
in line with your thinking, if we wish to recover we'd need to reserve the
resources. Having a three-way dialogue (to reset one another's control plane)
is a MUST. Now, doing that on the existing tunnel is what we are evaluating.
My proposal is that using the existing mechanism in the draft to reset the
old tunnel's control plane, thereby keeping the data plane hitless (because
the old tunnel-id is intact), is workable and fits in the existing constructs
of tunnel establishment, including individually authenticating peers upon
restart.

> How does the transition to the new sequence number occur? I presume we
> handoff the new sequence numbers to the reliable layer and it then purges
> it's re-ordering rx queue, renumbers any outstanding, transmits and
> immediately re-transmits them? Any stale receives get automatically
> discarded as out of window.
>
> - Renumbering outstanding transmits might create more unpredictability for
> no benefit. Because control plane on the failed node would have lost the
> context related to previous messages, it is better to flush everything off
> and start with new sequence numbers.
> [pwh4] I disagree. Say the outstanding transmit is an ICCN. The failed
> node may or may not have remembered sending the ICRP. By re-sending the
> ICCN, we allow the failed node at least the option of continuing the setup.

The target of the draft was to recover only the sessions that were in
established state. This means if an endpoint does not keep track of a
session's intermediate state (i.e. ICRP sent, or awaiting ICCN) then it
doesn't matter if the other sends an ICCN or not; it would be discarded upon
control plane restart for the tunnel.

> The state machines at the protocol layer should already be capable of
> handling an unexpected ICCN (or for that matter any unexpected packet). If
> we were to throw away these packets, then we're potentially requiring
> additional complication at the protocol layer.

Agree; the state machine should be able to handle any packet in the life of a
session or a tunnel. Regarding additional complication:
- I think section 2.3.4 addresses the inconsistency among session states.
- Renumbering the existing messages in the transmit queue is not going to
  eliminate the conditions that could result in the situations described in
  2.3.4, so that needs to be there anyways. Then why bother resending these
  messages?
- If we are recovering only the sessions that were in established state then
  there is no need to retransmit messages for situations that could be
  handled otherwise by defined mechanisms.

> I don't know how your protocol layer is implemented, but mine treats
> the reliable layer as a pipe.
> What I put into it is guaranteed to be delivered or I get a tunnel failed
> indication. It doesn't seem that I'd really want a tunnel failed indication
> here, but that would be my only choice if the reliable layer threw away a
> packet that had been submitted for transmit.

From what I have seen, the reliable layer typically is like a pipe which will
either reliably deliver a packet or provide a tunnel failure indication.
Therefore it doesn't make a difference from a delivery perspective whether we
renumber them or not. Once queued they'll be delivered. But my point was -
why even re-mark their sequence numbers?

> The old tunnel becomes active only upon getting confirmation from its peer
> that it has reset control plane sequence numbers. So this would have to be a
> three way handshake as described below. For example, for an old tunnel:
> - Failed node sends: "Reset Nr to 5665 for Old Local Tid = 9, Old Tid = 6".
> - Non Failed node responds: "Nr for Old Tid = 9 reset to 5665, Reset Nr to
> 4435 for the same tunnel (i.e. Old Local Tid = 6, Old Tid = 9 from this
> node's perspective)". Failed endpoint upon getting this message must first
> enqueue the response of this message and then start sending control messages
> on the Old tunnel.
> - Failed node sends: "Nr for Old Tid = 9 reset to 4435". Non failed node
> upon getting this can start sending control messages.
> [pwh4] I presume in the case where control messages with the new sequence
> number space start arriving at the non-failed node before the final resync
> message, the messages get discarded as OOW. The failed node will then
> re-transmit and they'll be accepted once the non-failed node gets the final
> resync message.

What we discuss above could work. However, to make it simpler I think simply resetting the control plane of old tunnel would be good enough. thanks, -- vipin e __________________________________ Do you Yahoo!? Protect your identity with Yahoo! Mail AddressGuard http://antispam.yahoo.com/whatsnewfree _______________________________________________ L2tpext mailing list L2tpext@ietf.org https://www1.ietf.org/mailman/listinfo/l2tpext From exim@www1.ietf.org Tue Nov 4 23:07:30 2003 Received: from optimus.ietf.org (ietf.org [132.151.1.19] (may be forged)) by ietf.org (8.9.1a/8.9.1a) with ESMTP id XAA22264 for ; Tue, 4 Nov 2003 23:07:30 -0500 (EST) Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AHEwq-0000PO-Cz for l2tpext-archive@odin.ietf.org; Tue, 04 Nov 2003 23:07:13 -0500 Received: (from exim@localhost) by www1.ietf.org (8.12.8/8.12.8/Submit) id hA547BIM001559 for l2tpext-archive@odin.ietf.org; Tue, 4 Nov 2003 23:07:11 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AHEwn-0000Os-A2 for l2tpext-web-archive@optimus.ietf.org; Tue, 04 Nov 2003 23:07:09 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id XAA22224 for ; Tue, 4 Nov 2003 23:06:54 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AHEwj-0003iu-00 for l2tpext-web-archive@ietf.org; Tue, 04 Nov 2003 23:07:05 -0500 Received: from ietf.org ([132.151.1.19] helo=optimus.ietf.org) by ietf-mx with esmtp (Exim 4.12) id 1AHEwi-0003ir-00 for l2tpext-web-archive@ietf.org; Tue, 04 Nov 2003 23:07:04 -0500 Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AHEwh-0000NS-QY; Tue, 04 Nov 2003 23:07:03 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AHEvv-0000IE-3t for 
l2tpext@optimus.ietf.org; Tue, 04 Nov 2003 23:06:15 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id XAA22141 for ; Tue, 4 Nov 2003 23:06:00 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AHEvp-0003hm-00 for L2tpext@ietf.org; Tue, 04 Nov 2003 23:06:09 -0500 Received: from web41301.mail.yahoo.com ([66.218.93.186]) by ietf-mx with smtp (Exim 4.12) id 1AHEvn-0003hH-00 for L2tpext@ietf.org; Tue, 04 Nov 2003 23:06:08 -0500 Message-ID: <20031105040538.90291.qmail@web41301.mail.yahoo.com> Received: from [66.17.149.13] by web41301.mail.yahoo.com via HTTP; Tue, 04 Nov 2003 20:05:38 PST Date: Tue, 4 Nov 2003 20:05:38 -0800 (PST) From: Vipin Jain To: Paul Howard Cc: l2tp ietflist , Keyur Parikh , vipin@riverstonenet.com In-Reply-To: <3FA7CEEE.9000209@juniper.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Subject: [L2tpext] Re: Questions about draft-ietf-l2tpext-failover-02.txt Sender: l2tpext-admin@ietf.org Errors-To: l2tpext-admin@ietf.org X-BeenThere: l2tpext@ietf.org X-Mailman-Version: 2.0.12 Precedence: bulk List-Unsubscribe: , List-Id: Layer Two Tunneling Protocol Extensions List-Post: List-Help: List-Subscribe: , hi Paul, my response inline.. > Reserved Bits: I am in for that. Based on the discussions earlier Mark andwe > agreed that if we can get it done by using a control plane message > exchangethere is no need to make a L2TP-header change for this. > [pwh4] I'm not sure how to interpret this comment. I agree we don't wantto > make changes where we can avoid them, but this needs to be balanced > againstproviding a resource efficient solution. By creating the concept of > unnumberedframes to do reliable layer signaling then we can have the reliable > layerre-sync the control plane without requiring a separate tunnel to do the > resyncand without requiring a tid change. IMHO, this would seem to justify > takinga reserved bit to indicate UI. 
I don't think introducing the concept of having unnumbered messages and changing header bits is a good idea. It would be bring in following problems: - How do you ack an unumbered message? More importantly how do you relibaly transmit them? Do they take same transmit queue and apply with rx window constrains? - What if a node is bombarded with such unnumbered messages? Are we suppose to interpret each one of them? > Using Ns=0, Nr=0 Scheme: This is flawed because it might naturally > coincidewith the sequence numbers of a tunnel thereby confusing this as > anacknowledgement for a packet sent earlier. > [pwh4] I don't think so. With unnumbered frames, there is no Ns/Nr thusthey > are ignored. I mentioned 0/0 only because they should probably beset to > something. The UI bit would be looked at by the reliable layer beforeit > looks at Ns/Nr. UI frames would be handled entirely by the reliablelayer > and not the protocol layer. Using Ns=0, Nr=0 when UI bit is set invites interpreting any message with Nr=0, Ns=0 and UI bit set - Does it invite a DoS attack? > One negativeto this approach is that it would require the systems to have the > resourcesavailable to at least temporarily manage the additional tunnels - > this mightimply being able to temporarily peak at double the normally > supported numberof tunnels. If the non-failed endpoint was at it's maximum > number of tunnels,how would it know to make an exception for the failover > tunnel setup? > And therefore we would RECOMMEND keeping space for at least tunnel for > recoverypurposes. If it can't establish the tunnel, then it could retry 'x' > timesbefore concluding the tunnel recovery mechanism failed. This is much > betterthan current proposal where we'd establish one new tunnel for every old > tunnel. > [pwh4] If we want recovery to proceed expeditiously (and we really only havea > maximum of 1 hello timeout plus max retransmit timeout), then we reallyneed > to be able to recovery tunnels in parallel. 
This could be very > resourceintensive. It doesn't seem likely that a large number of tunnels > will necessarilybe coming from the same peer. I typically see not more > than a small handful(say 3) tunnels coming from the same peer to allow for > different servicepolicies. With this scenario, I'd need an additional 33% > resources to recoverthe tunnels in parallel - that's a rather steep price to > pay. Parallel recovery was definitely one of the design goals (Appendix A.2); So inline with your thinking if we wish to recover we'd need to reserve the resources. Having a three-way dialogue (to reset one another's control plane) is a MUST. Now, doing that on the existing tunnel is what we are evaluating. My proposal is that using the existing mechanism in the draft if we reset the old tunnel's control plane thereby keeping the data plane hitless (because old tunnel-id is intact) is something workable and fits in existing constructs of tunnel establishment, including individually authenticating peers upon restart. > Howdoes the transition to the new sequence number occur? I presume we > handoff the new sequence numbers to the reliable layer andit then purges > it'sre-ordering rx queue, renumbers any outstanding,transmits and > immediatelyre-transmits them? Any stale receives get automatically discarded > as outof window. > > - Renumbering outstanding transmits might create more unpredictability for > nobenefit. Because control plane on the failed node would have lost the > contextrelated to previous messages, it is better to flush everything off and > startwith new sequence numbers. > [pwh4] I disagree. Say the outstanding transmit is an ICCN. The > failednode may or may not have remembered sending the ICRP. By re-sending > theICCN, we allow the failed node at least the option of continuing the > setup. The target of the draft was to recover only the sessions that were in established state. 
This means if an endpoint does not keep track of session's intermediate state (i.e. ICRP sent, or awaiting ICCN) then it doesn't matter if the other sends an ICCN or not, it would be discarded upon control plane restart for the tunnel. > The state machines at the protocol layer should already be capable of > handlingan unexpected ICCN (or for that matter any unexpected packet). If > we wereto throw away these packets, then we're potentially requiring > additionalcomplication at the protocol layer. Agre; State Machine should be able to handle any packet in a life of a session or a tunnel. Regarding additional complication: - I think section 2.3.4 addresses the inconsistency among sesison states. - Renumbering the exisitng messages in transmit queue is not going to eliminate the conditions that could result in the situations described in 2.3.4, so that needs to be there anyways. Then why bother resending these messages? - If we are reovering only the sessions that were in established state then there is no need to retransmit messages for situations that could be handled otherwise by defined mechanisms. > I don't know how your protocol layeris implemented, but mine treats > the reliable layer as a pipe. > What I putinto it is guaranteed to be delivered or I get a tunnel failed > indication. It doesn't seem that I'd really want a tunnel failed indication > here, butthat would be my only choice if the reliable layer threw away a > packet thathad been submitted for transmit. From what I have seen, the reliable layer typically is like a pipe which will either reliably deliver a packet or provide with a tunnel failure indication. Therefore it makes doesn't make a difference if we renumber them or not from delivery perspective. Once queued they'll be delivered. But my point was - why even remark their sequence numbers? > The old tunnel becomes active only upon getting confirmation from its peer > thatit has reset control plane sequence numbers. 
So this would have to be a > threeway handshake as described below. For example, for an old tunnel:- > Failed node sends: "Reset Nr to 5665 for Old Local Tid = 9, Old Tid = 6".- > Non Failed node responds: "Nr for Old Tid = 9 reset to 5665, Reset Nr to > 4435for the same tunnel (i.e. Old Local Tid = 6, Old Tid = 9 from this > node'sperspective)". Failed endpoint upon getting this message must first > enqueue theresponse of this message and then start sending control messages > on the Oldtunnel.- Failed node sends: "Nr for Old Tid = 9 reset to 4435". Non > failed node upongetting this can start sending control messages. > [pwh4] I presume in the case where control messages with the new > sequencenumber space start arriving at the non-failed node before the final > resyncmessage, the messages get discarded as OOW. The failed node will then > re-transmitand they'll be accepted once the non-failed node gets the final > resync message. What we discuss above could work. However, to make it simpler I think simply resetting the control plane of old tunnel would be good enough. thanks, -- vipin e __________________________________ Do you Yahoo!? Protect your identity with Yahoo! 
Mail AddressGuard http://antispam.yahoo.com/whatsnewfree _______________________________________________ L2tpext mailing list L2tpext@ietf.org https://www1.ietf.org/mailman/listinfo/l2tpext From l2tpext-admin@ietf.org Fri Nov 7 09:54:27 2003 Received: from optimus.ietf.org ([132.151.1.19]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id JAA16291 for ; Fri, 7 Nov 2003 09:54:27 -0500 (EST) Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AI7zt-0001KZ-MK; Fri, 07 Nov 2003 09:54:01 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AI7zX-0001KF-Of for l2tpext@optimus.ietf.org; Fri, 07 Nov 2003 09:53:42 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id JAA16272 for ; Fri, 7 Nov 2003 09:53:27 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AI7zV-0005yz-00 for L2tpext@ietf.org; Fri, 07 Nov 2003 09:53:37 -0500 Received: from natint2.juniper.net ([207.17.136.150] helo=beta.jnpr.net) by ietf-mx with esmtp (Exim 4.12) id 1AI7zV-0005yg-00 for L2tpext@ietf.org; Fri, 07 Nov 2003 09:53:37 -0500 Received: from pi-smtp.jnpr.net ([10.10.2.36]) by beta.jnpr.net with Microsoft SMTPSVC(5.0.2195.6713); Fri, 7 Nov 2003 06:53:06 -0800 Received: from juniper.net ([10.10.248.245] RDNS failed) by pi-smtp.jnpr.net with Microsoft SMTPSVC(5.0.2195.6713); Fri, 7 Nov 2003 09:53:04 -0500 Message-ID: <3FABB101.40906@juniper.net> Date: Fri, 07 Nov 2003 09:49:37 -0500 From: Paul Howard User-Agent: Mozilla/5.0 (Windows; U; WinNT4.0; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Vipin Jain CC: l2tp ietflist , Keyur Parikh , vipin@riverstonenet.com References: <20031105040538.90291.qmail@web41301.mail.yahoo.com> Content-Type: text/html; charset=us-ascii Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 07 Nov 2003 14:53:04.0841 
(UTC) FILETIME=[D9479B90:01C3A53E] Content-Transfer-Encoding: 7bit Subject: [L2tpext] Re: Questions about draft-ietf-l2tpext-failover-02.txt Sender: l2tpext-admin@ietf.org Errors-To: l2tpext-admin@ietf.org X-BeenThere: l2tpext@ietf.org X-Mailman-Version: 2.0.12 Precedence: bulk List-Unsubscribe: , List-Id: Layer Two Tunneling Protocol Extensions List-Post: List-Help: List-Subscribe: , Content-Transfer-Encoding: 7bit Vipin,

Thanx for your responses. I want to go back through the emails and the draft and carefully reconsider based on your responses. It will take me a few days. I'll try to respond by mid next week.

Thanx,

Paul

Vipin Jain wrote:
hi Paul,

my response inline..

  
> Reserved Bits: I am in for that. Based on the discussions earlier, Mark and we
> agreed that if we can get it done by using a control plane message
> exchange there is no need to make an L2TP-header change for this.

> [pwh4] I'm not sure how to interpret this comment. I agree we don't want to
> make changes where we can avoid them, but this needs to be balanced
> against providing a resource-efficient solution. By creating the concept of
> unnumbered frames to do reliable layer signaling, we can have the reliable
> layer re-sync the control plane without requiring a separate tunnel to do the
> resync and without requiring a tid change. IMHO, this would seem to justify
> taking a reserved bit to indicate UI.
    
I don't think introducing the concept of unnumbered messages and
changing header bits is a good idea. It would bring in the following problems:
- How do you ack an unnumbered message? More importantly, how do you reliably
transmit them? Do they use the same transmit queue and obey the rx window
constraints?
- What if a node is bombarded with such unnumbered messages? Are we supposed to
interpret each one of them?

  
> Using Ns=0, Nr=0 Scheme: This is flawed because it might naturally
> coincide with the sequence numbers of a tunnel, thereby confusing this as
> an acknowledgement for a packet sent earlier.

> [pwh4] I don't think so. With unnumbered frames, there is no Ns/Nr, thus they
> are ignored. I mentioned 0/0 only because they should probably be set to
> something. The UI bit would be looked at by the reliable layer before it
> looks at Ns/Nr. UI frames would be handled entirely by the reliable layer
> and not the protocol layer.
    
Using Ns=0, Nr=0 when the UI bit is set invites interpreting any message with
Nr=0, Ns=0 and the UI bit set. Does it invite a DoS attack?
      
  
> One negative to this approach is that it would require the systems to have the
> resources available to at least temporarily manage the additional tunnels -
> this might imply being able to temporarily peak at double the normally
> supported number of tunnels. If the non-failed endpoint was at its maximum
> number of tunnels, how would it know to make an exception for the failover
> tunnel setup?

> And therefore we would RECOMMEND keeping space for at least one tunnel for
> recovery purposes. If it can't establish the tunnel, then it could retry 'x'
> times before concluding the tunnel recovery mechanism failed. This is much
> better than the current proposal where we'd establish one new tunnel for
> every old tunnel.

> [pwh4] If we want recovery to proceed expeditiously (and we really only have a
> maximum of 1 hello timeout plus max retransmit timeout), then we really need
> to be able to recover tunnels in parallel. This could be very
> resource intensive. It doesn't seem likely that a large number of tunnels
> will necessarily be coming from the same peer. I typically see not more
> than a small handful (say 3) tunnels coming from the same peer to allow for
> different service policies. With this scenario, I'd need an additional 33%
> resources to recover the tunnels in parallel - that's a rather steep price to
> pay.
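
As a rough check on the figures in this exchange, the arithmetic can be sketched as below. This is one reading of the thread rather than anything the draft states: the function name `peak_overhead` and the assumption that overhead is simply recovery tunnels divided by existing tunnels are illustrative.

```python
def peak_overhead(existing_tunnels: int, recovery_tunnels: int) -> float:
    """Extra tunnel capacity needed during recovery, as a fraction of normal load."""
    if existing_tunnels <= 0:
        raise ValueError("existing_tunnels must be positive")
    return recovery_tunnels / existing_tunnels


# One recovery tunnel set aside for the 3 tunnels to a peer: about a third extra.
shared = peak_overhead(existing_tunnels=3, recovery_tunnels=1)

# One new tunnel for every old tunnel: the peak is double the normal count.
per_tunnel = peak_overhead(existing_tunnels=3, recovery_tunnels=3)

print(f"shared: {shared:.0%}, per-tunnel: {per_tunnel:.0%}")
```

Under these assumptions the gap between the two approaches grows with the number of tunnels per peer, which is why the typical per-peer tunnel count matters to the argument.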
    
Parallel recovery was definitely one of the design goals (Appendix A.2). So,
in line with your thinking, if we wish to recover we'd need to reserve the
resources. Having a three-way dialogue (to reset one another's control plane)
is a MUST. Now, doing that on the existing tunnel is what we are evaluating.

My proposal is that if, using the existing mechanism in the draft, we reset the
old tunnel's control plane, thereby keeping the data plane hitless (because the
old tunnel-id is intact), that is something workable and fits in the existing
constructs of tunnel establishment, including individually authenticating peers
upon restart.

  
> How does the transition to the new sequence number occur? I presume we
> hand off the new sequence numbers to the reliable layer and it then purges
> its re-ordering rx queue, renumbers any outstanding transmits and
> immediately re-transmits them? Any stale receives get automatically discarded
> as out of window.

> - Renumbering outstanding transmits might create more unpredictability for
> no benefit. Because the control plane on the failed node would have lost the
> context related to previous messages, it is better to flush everything and
> start with new sequence numbers.

> [pwh4] I disagree. Say the outstanding transmit is an ICCN. The
> failed node may or may not have remembered sending the ICRP. By re-sending
> the ICCN, we allow the failed node at least the option of continuing the
> setup.
    
The target of the draft was to recover only the sessions that were in the
established state. This means that if an endpoint does not keep track of a
session's intermediate state (i.e. ICRP sent, or awaiting ICCN), then it doesn't
matter whether the other end sends an ICCN or not; it would be discarded upon
control plane restart for the tunnel.

  
> The state machines at the protocol layer should already be capable of
> handling an unexpected ICCN (or for that matter any unexpected packet). If
> we were to throw away these packets, then we're potentially requiring
> additional complication at the protocol layer.
    
Agreed; the state machine should be able to handle any packet in the life of a
session or a tunnel. Regarding additional complication:
- I think section 2.3.4 addresses the inconsistency among session states.
- Renumbering the existing messages in the transmit queue is not going to
eliminate the conditions that could result in the situations described in
2.3.4, so that needs to be there anyway. Then why bother resending these
messages?
- If we are recovering only the sessions that were in the established state,
then there is no need to retransmit messages for situations that could be
handled otherwise by defined mechanisms.

  
> I don't know how your protocol layer is implemented, but mine treats
> the reliable layer as a pipe.
> What I put into it is guaranteed to be delivered or I get a tunnel failed
> indication. It doesn't seem that I'd really want a tunnel failed indication
> here, but that would be my only choice if the reliable layer threw away a
> packet that had been submitted for transmit.
    
From what I have seen, the reliable layer typically is like a pipe which will
either reliably deliver a packet or provide a tunnel failure indication.
Therefore it doesn't make a difference, from a delivery perspective, whether we
renumber them or not. Once queued they'll be delivered. But my point was: why
even re-mark their sequence numbers?
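
The two options being debated for the outstanding transmit queue can be contrasted with a small sketch. It is illustrative only: the queue representation (a list of `(ns, message)` pairs), the function names `renumber` and `flush`, and the sample queue contents are assumptions made for the example, not anything defined by the draft.

```python
SEQ_MOD = 1 << 16  # L2TP control-channel Ns/Nr are 16-bit counters


def renumber(outstanding, new_ns):
    """Keep every queued message, assigning consecutive numbers from new_ns."""
    return [((new_ns + i) % SEQ_MOD, msg)
            for i, (_old_ns, msg) in enumerate(outstanding)]


def flush(outstanding):
    """Drop every queued message and report what was dropped to the caller."""
    return [], [msg for _old_ns, msg in outstanding]


# Hypothetical unacknowledged transmits at the moment of resync.
queue = [(117, "ICCN"), (118, "HELLO")]

print(renumber(queue, 4435))   # messages survive under the new numbering
kept, dropped = flush(queue)
print(kept, dropped)           # nothing survives; the caller sees what was lost
```

With `renumber` the protocol layer's "pipe" guarantee still holds for messages already submitted; with `flush` something has to account for the dropped messages, which is the concern raised in the quoted text.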
  
  
> The old tunnel becomes active only upon getting confirmation from its peer
> that it has reset control plane sequence numbers. So this would have to be a
> three-way handshake as described below. For example, for an old tunnel:
> - Failed node sends: "Reset Nr to 5665 for Old Local Tid = 9, Old Tid = 6".
> - Non-failed node responds: "Nr for Old Tid = 9 reset to 5665, Reset Nr to
>   4435 for the same tunnel (i.e. Old Local Tid = 6, Old Tid = 9 from this
>   node's perspective)". The failed endpoint, upon getting this message, must
>   first enqueue the response to this message and then start sending control
>   messages on the old tunnel.
> - Failed node sends: "Nr for Old Tid = 9 reset to 4435". The non-failed node,
>   upon getting this, can start sending control messages.
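
The three-way exchange in the example above can be walked through with a short sketch. It assumes a lossless, in-order exchange and uses made-up names (`Endpoint`, `three_way_resync`, `may_send`); the tunnel ids and sequence numbers are the ones from the example, and nothing here is a message format from the draft.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    local_tid: int          # this node's id for the old tunnel
    peer_tid: int           # the peer's id for the same tunnel
    ns: int = 0             # next sequence number this node will send
    nr: int = 0             # next sequence number this node expects
    may_send: bool = False  # allowed to send control messages on the old tunnel


def three_way_resync(failed, peer, failed_ns, peer_ns):
    """Apply the three messages in order and return them as a trace."""
    trace = []
    # 1. The failed node tells its peer which Nr to expect from now on.
    failed.ns = failed_ns
    trace.append(f"{failed.name}: Reset Nr to {failed_ns} for Old Local Tid = "
                 f"{failed.local_tid}, Old Tid = {failed.peer_tid}")
    # 2. The peer confirms and announces its own restart value.
    peer.nr, peer.ns = failed_ns, peer_ns
    trace.append(f"{peer.name}: Nr for Old Tid = {failed.local_tid} reset to "
                 f"{failed_ns}, Reset Nr to {peer_ns}")
    # 3. The failed node enqueues its confirmation first, then may send.
    failed.nr = peer_ns
    trace.append(f"{failed.name}: Nr for Old Tid = {failed.local_tid} "
                 f"reset to {peer_ns}")
    failed.may_send = True
    # The peer may send only once that final confirmation has arrived.
    peer.may_send = True
    return trace


failed = Endpoint("failed", local_tid=9, peer_tid=6)
peer = Endpoint("non-failed", local_tid=6, peer_tid=9)
for line in three_way_resync(failed, peer, failed_ns=5665, peer_ns=4435):
    print(line)
```

The ordering is the point of the sketch: each side starts sending on the old tunnel only after it knows the other side has adopted the new numbering.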
    

  
> [pwh4] I presume in the case where control messages with the new
> sequence number space start arriving at the non-failed node before the final
> resync message, the messages get discarded as OOW. The failed node will then
> re-transmit and they'll be accepted once the non-failed node gets the final
> resync message.
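
The out-of-window (OOW) behaviour presumed here can be illustrated with a minimal receive-window check. This is a sketch under stated assumptions: Ns/Nr are treated as 16-bit counters compared modulo 65536, and the window size and the old expected Nr are invented values for the example.

```python
SEQ_MOD = 1 << 16  # Ns and Nr are 16-bit values in the L2TP control header


def in_receive_window(ns: int, expected_nr: int, window: int) -> bool:
    """True if ns lies in [expected_nr, expected_nr + window) modulo 2**16."""
    return (ns - expected_nr) % SEQ_MOD < window


WINDOW = 8    # hypothetical receive window size
OLD_NR = 120  # hypothetical Nr the non-failed node expected before the resync

# Before the final resync message arrives, the new numbering is out of window.
print(in_receive_window(5665, OLD_NR, WINDOW))

# Once the non-failed node has reset its Nr to 5665, the retransmit is accepted.
print(in_receive_window(5665, 5665, WINDOW))
```

Note that a new starting value that happened to fall inside the old window would not be rejected by this check alone, so the discard behaviour depends on how the new sequence numbers are chosen.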
    
What we discuss above could work. However, to keep it simpler, I think simply
resetting the control plane of the old tunnel would be good enough.
  

thanks,
-- vipin

  

_______________________________________________ L2tpext mailing list L2tpext@ietf.org https://www1.ietf.org/mailman/listinfo/l2tpext From exim@www1.ietf.org Fri Nov 7 09:54:29 2003 Received: from optimus.ietf.org ([132.151.1.19]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id JAA16306 for ; Fri, 7 Nov 2003 09:54:29 -0500 (EST) Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AI802-0001Lc-RL for l2tpext-archive@odin.ietf.org; Fri, 07 Nov 2003 09:54:11 -0500 Received: (from exim@localhost) by www1.ietf.org (8.12.8/8.12.8/Submit) id hA7EsAJg005174 for l2tpext-archive@odin.ietf.org; Fri, 7 Nov 2003 09:54:10 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AI802-0001LN-MF for l2tpext-web-archive@optimus.ietf.org; Fri, 07 Nov 2003 09:54:10 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id JAA16282 for ; Fri, 7 Nov 2003 09:53:58 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AI800-0005zN-00 for l2tpext-web-archive@ietf.org; Fri, 07 Nov 2003 09:54:08 -0500 Received: from [132.151.1.19] (helo=optimus.ietf.org) by ietf-mx with esmtp (Exim 4.12) id 1AI800-0005zJ-00 for l2tpext-web-archive@ietf.org; Fri, 07 Nov 2003 09:54:08 -0500 Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AI7zt-0001KZ-MK; Fri, 07 Nov 2003 09:54:01 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AI7zX-0001KF-Of for l2tpext@optimus.ietf.org; Fri, 07 Nov 2003 09:53:42 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id JAA16272 for ; Fri, 7 Nov 2003 09:53:27 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AI7zV-0005yz-00 for L2tpext@ietf.org; Fri, 07 Nov 2003 
09:53:37 -0500 Received: from natint2.juniper.net ([207.17.136.150] helo=beta.jnpr.net) by ietf-mx with esmtp (Exim 4.12) id 1AI7zV-0005yg-00 for L2tpext@ietf.org; Fri, 07 Nov 2003 09:53:37 -0500 Received: from pi-smtp.jnpr.net ([10.10.2.36]) by beta.jnpr.net with Microsoft SMTPSVC(5.0.2195.6713); Fri, 7 Nov 2003 06:53:06 -0800 Received: from juniper.net ([10.10.248.245] RDNS failed) by pi-smtp.jnpr.net with Microsoft SMTPSVC(5.0.2195.6713); Fri, 7 Nov 2003 09:53:04 -0500 Message-ID: <3FABB101.40906@juniper.net> Date: Fri, 07 Nov 2003 09:49:37 -0500 From: Paul Howard User-Agent: Mozilla/5.0 (Windows; U; WinNT4.0; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Vipin Jain CC: l2tp ietflist , Keyur Parikh , vipin@riverstonenet.com References: <20031105040538.90291.qmail@web41301.mail.yahoo.com> Content-Type: text/html; charset=us-ascii Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 07 Nov 2003 14:53:04.0841 (UTC) FILETIME=[D9479B90:01C3A53E] Content-Transfer-Encoding: 7bit Subject: [L2tpext] Re: Questions about draft-ietf-l2tpext-failover-02.txt Sender: l2tpext-admin@ietf.org Errors-To: l2tpext-admin@ietf.org X-BeenThere: l2tpext@ietf.org X-Mailman-Version: 2.0.12 Precedence: bulk List-Unsubscribe: , List-Id: Layer Two Tunneling Protocol Extensions List-Post: List-Help: List-Subscribe: , Content-Transfer-Encoding: 7bit Content-Transfer-Encoding: 7bit Vipin,

Thanx for your responses.   I want to go back through the emails and the draft and carefully reconsider based on your responses.   It will take me a few days.   I'll try to respond by mid next week.

Thanx,

Paul

Vipin Jain wrote:
hi Paul,

my response inline..

  
Reserved Bits: I am in for that. Based on the discussions earlier Mark andwe 
agreed that if we can get it done by using a control plane message 
exchangethere is no need to make a L2TP-header change for this.
    

  
[pwh4] I'm not sure how to interpret this comment.   I agree we don't wantto 
make changes where we can avoid them, but this needs to be balanced 
againstproviding a resource efficient solution.   By creating the concept of 
unnumberedframes to do reliable layer signaling then we can have the reliable
    

  
layerre-sync the control plane without requiring a separate tunnel to do the 
resyncand without requiring a tid change.   IMHO, this would seem to justify 
takinga reserved bit to indicate UI.
    
I don't think introducing the concept of having unnumbered messages and
changing header bits is a good idea. It would be bring in following problems:
- How do you ack an unumbered message? More importantly how do you relibaly
transmit them? Do they take same transmit queue and apply with rx window
constrains?
- What if a node is bombarded with such unnumbered messages? Are we suppose to
interpret each one of them?

  
Using Ns=0, Nr=0 Scheme: This is flawed because it might naturally 
coincidewith the sequence numbers of a tunnel thereby confusing this as 
anacknowledgement for a packet sent earlier.
[pwh4] I don't think so.   With unnumbered frames, there is no Ns/Nr thusthey
    

  
are ignored.   I mentioned 0/0 only because they should probably beset to 
something.   The UI bit would be looked at by the reliable layer beforeit 
looks at Ns/Nr.   UI frames would be handled entirely by the reliablelayer 
and not the protocol layer.
    
Using Ns=0, Nr=0 when UI bit is set invites interpreting any message with Nr=0,
Ns=0 and UI bit set - Does it invite a DoS attack?  
      
  
One negativeto this approach is that it would require the systems to have the
resourcesavailable to at least temporarily manage the additional tunnels -
this mightimply being able to temporarily peak at double the normally 
supported numberof tunnels.   If the non-failed endpoint was at it's maximum 
number of tunnels,how would it know to make an exception for the failover 
tunnel setup?    
    
    
  
And therefore we would RECOMMEND keeping space for at least tunnel for 
recoverypurposes. If it can't establish the tunnel, then it could retry 'x' 
timesbefore concluding the tunnel recovery mechanism failed. This is much 
betterthan current proposal where we'd establish one new tunnel for every old
    

  
tunnel.  
    

  
[pwh4] If we want recovery to proceed expeditiously (and we really only havea
    

  
maximum of 1 hello timeout plus max retransmit timeout), then we reallyneed 
to be able to recovery tunnels in parallel.   This could be very 
resourceintensive.   It doesn't seem likely that a large number of tunnels 
will necessarilybe coming from the same peer.    I typically see not more 
than a small handful(say 3) tunnels coming from the same peer to allow for 
different servicepolicies.   With this scenario, I'd need an additional 33% 
resources to recoverthe tunnels in parallel - that's a rather steep price to 
pay. 
    
Parallel recovery was definitely one of the design goals (Appendix A.2); So
inline with your thinking if we wish to recover we'd need to reserve the
resources. Having a three-way dialogue (to reset one another's control plane)
is a MUST. Now, doing that on the existing tunnel is what we are evaluating.

My proposal is that using the existing mechanism in the draft if we reset the
old tunnel's control plane thereby keeping the data plane hitless (because old
tunnel-id is intact) is something workable and fits in existing constructs of
tunnel establishment, including individually authenticating peers upon restart.

  
Howdoes the transition to the new sequence number occur? I  presume we 
handoff the new sequence numbers to the reliable layer andit then purges 
it'sre-ordering rx queue, renumbers any outstanding,transmits and 
immediatelyre-transmits them?  Any stale receives get automatically discarded
    

  
as outof window.    
    
- Renumbering outstanding transmits might create more unpredictability for 
nobenefit. Because control plane on the failed node would have lost the 
contextrelated to previous messages, it is better to flush everything off and
    

  
startwith new sequence numbers.
    

  
[pwh4] I disagree.  Say the outstanding transmit is an ICCN.  The failed
node may or may not have remembered sending the ICRP.  By re-sending the ICCN,
we allow the failed node at least the option of continuing the setup.
    
The target of the draft was to recover only the sessions that were in
established state. This means if an endpoint does not keep track of session's
intermediate state (i.e. ICRP sent, or awaiting ICCN) then it doesn't matter if
the other sends an ICCN or not, it would be discarded upon control plane
restart for the tunnel.

  
The state machines at the protocol layer should already be capable of
handling an unexpected ICCN (or for that matter any unexpected packet).  If
we were to throw away these packets, then we're potentially requiring
additional complication at the protocol layer.
    
Agreed; the state machine should be able to handle any packet in the life of a
session or a tunnel. Regarding additional complication:
- I think section 2.3.4 addresses the inconsistency among session states.
- Renumbering the existing messages in the transmit queue is not going to
eliminate the conditions that could result in the situations described in
2.3.4, so that needs to be there anyway. Then why bother resending these
messages?
- If we are recovering only the sessions that were in established state then
there is no need to retransmit messages for situations that could be handled
otherwise by defined mechanisms.

  
I don't know how your protocol layer is implemented, but mine treats the
reliable layer as a pipe.  What I put into it is guaranteed to be delivered or
I get a tunnel failed indication.  It doesn't seem that I'd really want a
tunnel failed indication here, but that would be my only choice if the
reliable layer threw away a packet that had been submitted for transmit.
    
From what I have seen, the reliable layer typically is like a pipe which will
either reliably deliver a packet or provide a tunnel failure indication.
Therefore it doesn't make a difference from a delivery perspective whether we
renumber them or not. Once queued they'll be delivered. But my point was - why
even re-mark their sequence numbers?
  
  
The old tunnel becomes active only upon getting confirmation from its peer
that it has reset control plane sequence numbers. So this would have to be a
three way handshake as described below. For example, for an old tunnel:
- Failed node sends: "Reset Nr to 5665 for Old Local Tid = 9, Old Tid = 6".
- Non failed node responds: "Nr for Old Tid = 9 reset to 5665, Reset Nr to
  4435 for the same tunnel (i.e. Old Local Tid = 6, Old Tid = 9 from this
  node's perspective)". Failed endpoint upon getting this message must first
  enqueue the response of this message and then start sending control messages
  on the Old tunnel.
- Failed node sends: "Nr for Old Tid = 9 reset to 4435". Non failed node upon
  getting this can start sending control messages.
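
To make the ordering of the three messages concrete, here is a minimal sketch in Python. It is only an illustration of the exchange quoted above, not text from the draft; the Endpoint class, its field names and the three_way_resync function are hypothetical, and retransmission and authentication are left out.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    local_tid: int
    ns: int                # first Ns this endpoint uses after the resync
    nr: int = 0            # next Ns expected from the peer
    may_send: bool = False


def three_way_resync(failed: Endpoint, survivor: Endpoint) -> list:
    """Walk the three messages in order and return a trace of them."""
    trace = []

    # 1. Failed node announces the sequence number it will start from.
    trace.append(f"failed: Reset Nr to {failed.ns} for Tid {failed.local_tid}")
    survivor.nr = failed.ns

    # 2. Non failed node confirms and announces its own starting number.
    trace.append(f"survivor: Nr reset to {survivor.nr}, Reset Nr to {survivor.ns}")
    failed.nr = survivor.ns
    # The failed node enqueues its confirmation first, then may send.
    failed.may_send = True

    # 3. Failed node confirms; only now may the non failed node send.
    trace.append(f"failed: Nr reset to {failed.nr}")
    survivor.may_send = True
    return trace


failed = Endpoint(local_tid=9, ns=5665)
survivor = Endpoint(local_tid=6, ns=4435)
for line in three_way_resync(failed, survivor):
    print(line)
```

The numbers 5665, 4435 and the tunnel ids 9 and 6 are the ones used in the example above.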
    

  
[pwh4] I presume in the case where control messages with the new sequence
number space start arriving at the non-failed node before the final resync
message, the messages get discarded as OOW.  The failed node will then
re-transmit and they'll be accepted once the non-failed node gets the final
resync message.
    
What we discuss above could work. However, to make it simpler I think simply
resetting the control plane of old tunnel would be good enough.
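
The out-of-window (OOW) discard presumed above can be sketched as follows. This assumes 16-bit sequence numbers compared modulo 65536, as in the L2TP control connection; the in_window helper, the window size of 4 and the old Nr value of 120 are hypothetical values chosen for illustration.

```python
SEQ_MOD = 1 << 16  # L2TP control sequence numbers wrap at 65536


def in_window(ns: int, nr: int, window: int) -> bool:
    """True if a received Ns falls in [nr, nr + window), modulo 65536."""
    return (ns - nr) % SEQ_MOD < window


# Before the final resync message the non-failed node still expects its
# old Nr (120 here, hypothetical), so a message numbered in the new
# space (5665) is out of window and gets discarded.
print(in_window(5665, nr=120, window=4))

# After the final resync message Nr has been reset to 5665, so the
# retransmitted message is accepted.
print(in_window(5665, nr=5665, window=4))
```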
  

thanks,
-- vipin

_______________________________________________ L2tpext mailing list L2tpext@ietf.org https://www1.ietf.org/mailman/listinfo/l2tpext From l2tpext-admin@ietf.org Wed Nov 12 17:25:35 2003 Received: from optimus.ietf.org ([132.151.1.19]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA13210 for ; Wed, 12 Nov 2003 17:25:35 -0500 (EST) Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AK3Q7-00045Y-Dh; Wed, 12 Nov 2003 17:25:03 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AK3PX-000449-Jo for l2tpext@optimus.ietf.org; Wed, 12 Nov 2003 17:24:30 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA13170 for ; Wed, 12 Nov 2003 17:24:09 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AK3PQ-0007gE-00 for L2tpext@ietf.org; Wed, 12 Nov 2003 17:24:20 -0500 Received: from westford-nat.juniper.net ([65.194.140.2] helo=pi-smtp.jnpr.net) by ietf-mx with esmtp (Exim 4.12) id 1AK3PP-0007fC-00 for L2tpext@ietf.org; Wed, 12 Nov 2003 17:24:19 -0500 Received: from juniper.net ([10.10.248.153] RDNS failed) by pi-smtp.jnpr.net with Microsoft SMTPSVC(5.0.2195.6713); Wed, 12 Nov 2003 17:23:48 -0500 Message-ID: <3FB2B20D.3000205@juniper.net> Date: Wed, 12 Nov 2003 17:19:57 -0500 From: Paul Howard User-Agent: Mozilla/5.0 (Windows; U; WinNT4.0; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Vipin Jain CC: l2tp ietflist , Keyur Parikh , vipin@riverstonenet.com References: <20031105040538.90291.qmail@web41301.mail.yahoo.com> Content-Type: text/html; charset=us-ascii Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 12 Nov 2003 22:23:48.0075 (UTC) FILETIME=[A45E63B0:01C3A96B] Content-Transfer-Encoding: 7bit Subject: [L2tpext] Re: Questions about draft-ietf-l2tpext-failover-02.txt Sender: l2tpext-admin@ietf.org Errors-To: 
l2tpext-admin@ietf.org X-BeenThere: l2tpext@ietf.org X-Mailman-Version: 2.0.12 Precedence: bulk List-Unsubscribe: , List-Id: Layer Two Tunneling Protocol Extensions List-Post: List-Help: List-Subscribe: , Content-Transfer-Encoding: 7bit Vipin,

Thanx for your responses.  Rather than inlining further comments, I've tried to summarize (from my perspective at least) where things stand.  I've tried to bring out all of the proposals that we've discussed, but may have missed some in the morass our e-mail has become :-)

It seems that the crux of the discussion boils down to whether to use a separate recovery tunnel for each extant tunnel (as described in the current draft) or to do recovery within each tunnel.   Any solution needs to allow for the possibility of a completely hitless data plane (i.e. 0 data packet loss).   Discussions so far have focused upon a basic 3 way handshake to perform the recovery of the control connection's sequence numbers regardless of whether a separate recovery tunnel is used.

Using a separate recovery tunnel seems to present the following advantages and disadvantages:

Plus:

- Leverages extant reliable packet delivery of L2TP control connection via separate recovery tunnel

Minus:

- To allow parallelism (necessary to recover in a timely fashion and avoid disconnects due to tunnel hello failures), the endpoints must be capable of supporting some number of tunnels over and above their normal operating limits.   Given my implementation currently runs up to 8,000 tunnels, I'd be looking at having to allow transient peaks of 16,000 tunnels for fully parallel recovery.   Using less than a 100% resource reserve and a retry mechanism as you have proposed (thus doing only a subset of the tunnels in parallel) would reduce the resource requirements, but increase the likelihood of tunnels failing due to increased latency in tunnel recovery.
- Unless the semantics are changed to allow two tunnels with the same TIDs, the TIDs of the recovery tunnel must differ from the old tunnel.   This implies re-programming of the data plane for all active sessions and, most likely, at least a brief interruption of data flow.    Your proposal to reset the old tunnel's sequence numbers instead of replacing the old tunnel with the recovery tunnel would address this issue.   It does, however, carry with it additional packet overhead (due to the required shutdown of the recovery tunnel as opposed to the silent abandonment of the old tunnel).
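
The resource trade-off in the first point can be put in numbers with a small sketch. It assumes every tunnel has to be recovered, that each recovery tunnel occupies one tunnel slot, and that a slot is free again before the next batch starts; the function names are hypothetical.

```python
import math


def peak_tunnels(normal: int, reserve_fraction: float) -> int:
    """Transient peak when a fraction of the normal limit is held in reserve."""
    return normal + math.ceil(normal * reserve_fraction)


def recovery_rounds(normal: int, reserve_fraction: float) -> int:
    """Sequential batches needed if each batch is limited to the reserve."""
    reserve = math.ceil(normal * reserve_fraction)
    return math.ceil(normal / reserve)


print(peak_tunnels(8000, 1.0), recovery_rounds(8000, 1.0))    # fully parallel
print(peak_tunnels(8000, 0.25), recovery_rounds(8000, 0.25))  # partial reserve
```

With a 100% reserve the 8,000 tunnels recover in one batch at a peak of 16,000; a smaller reserve lowers the peak but stretches recovery over several batches, each of which still has to finish inside the hello timeout budget.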

Using the existing tunnel to recover from the failover:

Plus:

- No additional resources required to allow fully parallel recovery
- No reprogramming of the data plane required (in the case of a control plane only failover) - thus no interruption of the data flow.

Minus:

- Reliable packet delivery mechanisms of the control connection are unavailable to do the recovery handshake (since one side has lost its knowledge of Ns and Nr).   This implies that the handshake must be done in an unnumbered mode.   Procedures for reliable delivery and acknowledgement would need to be provided; however, the nature of the 3 way handshake deals with most of these issues.

You mentioned specific concerns about unnumbered mode in your last response.

- Acknowledgements - A possible approach is that each packet of the 3 way handshake except the last is implicitly acknowledged by the next packet in the handshake; the last packet in the handshake delivers the last of the sequence number reset data and thus can be acknowledged by requiring a normal ZLB Ack upon receipt.
- Reliable transmit - It seems that retransmits of outstanding handshake frames pending acknowledgement would handle this issue.
- Transmit queue - The normal transmit queue of the control connection is out of commission pending re-sync of the sequence numbers.   Any packets on this queue would be held pending conclusion of the resync.
- Receive window constraints - The receive window is also out of commission pending re-sync of the sequence numbers thus it's not clear how it could even be applied.   The 3 way handshake does in itself effectively enforce a flow control with an RWS of 1.  This shouldn't be an issue since nothing else can happen on the tunnel's control connection pending resync.
- Bombarded with unnumbered traffic - This could happen with any frame (numbered or unnumbered) and a badly behaved peer (or hacker).   The control connection would have to do at least some packet examination to discard (but it does for bogus numbered mode frames as well).   If a more efficient discard of such frames was an issue, then an exchange of re-sync cookies as part of the initial control connection setup would allow more efficient discard of bogus frames and allow preliminary validation of a resync request (all unnumbered mode frames would be required to carry the appropriate resync cookie).
- DOS attack - I'm assuming a hacker with no ability to snoop (if the hacker can snoop, then they can just send the appropriate StopCCN and any protections on the resync mechanism or issues with numbered vs unnumbered mode are moot).   Up to the point of the hacker guessing source IP, dest IP, and TID there doesn't seem to be any difference in the susceptibility of numbered vs unnumbered mode.   After this point, numbered mode has the advantage of requiring the proper guess for Ns.   The use of a resync cookie (or more generically an unnumbered mode cookie) in all unnumbered frames would provide an equivalent level of protection for unnumbered mode.
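
A minimal sketch of the resync cookie idea, assuming the cookie is a random value exchanged during the initial control connection setup and stored per local TID. Neither the cookie size nor the frame layout is defined anywhere in this thread; UnnumberedFrame, new_resync_cookie and accept_unnumbered are hypothetical names.

```python
import hmac
import secrets
from dataclasses import dataclass


@dataclass
class UnnumberedFrame:
    tid: int          # local tunnel id the frame is addressed to
    cookie: bytes     # resync cookie carried in every unnumbered frame
    payload: bytes


def new_resync_cookie(nbytes: int = 8) -> bytes:
    """Random cookie to exchange during initial tunnel setup (size assumed)."""
    return secrets.token_bytes(nbytes)


def accept_unnumbered(frame: UnnumberedFrame, cookies: dict) -> bool:
    """Cheap early discard: unknown TID or wrong cookie is dropped."""
    expected = cookies.get(frame.tid)
    if expected is None:
        return False
    # Constant-time comparison so the check itself leaks nothing.
    return hmac.compare_digest(expected, frame.cookie)


cookies = {9: new_resync_cookie()}
good = UnnumberedFrame(tid=9, cookie=cookies[9], payload=b"resync")
forged = UnnumberedFrame(tid=9, cookie=b"\x00" * 7, payload=b"resync")
print(accept_unnumbered(good, cookies), accept_unnumbered(forged, cookies))
```

An attacker who cannot snoop then has to guess the cookie in addition to source IP, dest IP and TID, which plays the role that guessing Ns plays in numbered mode.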

Other outstanding issues that I see in the e-mail thread:

- What to do about frames on the transmit queue of the control connection?    I had suggested renumbering them once the new sequence numbers had been established.   You suggested discarding all of the frames.   I think either approach will work.   I'm concerned about some of the issues arising from a decision to discard - primarily the impact on the protocol layer, which can no longer submit frames and expect them to be unconditionally delivered barring a tunnel failure.   Consider an established session where the non failed endpoint has just queued a CDN for transmit when a failover occurs.   Both endpoints think the session is established, but with a discard of the pending transmits, the CDN never gets sent.  Now the FSQ/FSR mechanism would detect this; however, the discard of the CDN may have tossed information of interest to the peer (e.g. disconnect cause from RFC 3145).   I guess I don't see the harm in re-numbering upon resync.   Any frames in the transmit queue are by definition outstanding (some may not even have been sent yet due to flow control).   The frames may or may not have been received at the peer before the failover (and if received may or may not have been remembered).   The worst case is the frame was received and remembered, at which point the protocol layer at the peer will get a second copy of the packet (the control connection duplicate elimination won't catch it) and the session will most likely get torn down as a result (since the 2nd packet would arrive while the state machine is not expecting it).   Frames not remembered would continue whatever action was being attempted before the failover without harm.    I suppose that if failover is slow enough, any frames in the transmit queue would be so stale as to be not worth sending.   I question whether this is justification to toss the frames in the transmit queue given that failover may occur fast enough to avoid imposing any staleness on these frames.
A decision to toss outstanding frames means the protocol layer must be adjusted to deal with transmit failure in the absence of tunnel failure - thus e.g. an established session may have to re-submit a CDN.  In one of your comments, you brought up re-transmitting without re-numbering.   I believe this choice will cause the tunnel to fail, since such packets will not be acknowledged by the peer, causing re-transmission of the packets and eventual tunnel failure.   Having said all of the above, it's probably reasonable that this be an implementation decision (the implementation must either discard the contents of the transmit queue or re-number the frames; the implementation must not transmit without re-numbering).   It may be worthwhile to have a section describing the alternatives and the potential issues with each.
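
The two permitted choices for the transmit queue can be sketched as below. This is an illustration only: frames are modelled as plain dicts with hypothetical keys msg, ns and nr, and sequence numbers are assumed to wrap at 65536.

```python
from collections import deque

SEQ_MOD = 1 << 16  # sequence numbers wrap at 65536


def renumber(queue: deque, new_ns: int, new_nr: int) -> int:
    """Option 1: keep every queued frame, restamped in the new space.

    Returns the Ns to use for the next newly submitted frame.
    """
    for frame in queue:
        frame["ns"] = new_ns
        frame["nr"] = new_nr
        new_ns = (new_ns + 1) % SEQ_MOD
    return new_ns


def discard(queue: deque) -> list:
    """Option 2: drop everything and report what the protocol layer lost."""
    dropped = [frame["msg"] for frame in queue]
    queue.clear()
    return dropped


queue = deque([{"msg": "CDN", "ns": 17, "nr": 42},
               {"msg": "ICCN", "ns": 18, "nr": 42}])
next_ns = renumber(queue, new_ns=4435, new_nr=5665)
print([(f["msg"], f["ns"], f["nr"]) for f in queue], next_ns)
```

With discard, the returned list is what the protocol layer would need to be told about (for example so that an established session can re-submit its CDN); transmitting the queue unchanged is the one case excluded above.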

Thanx,

Paul

Vipin Jain wrote:
hi Paul,

my response inline..

  
Reserved Bits: I am in for that. Based on the discussions earlier Mark and we
agreed that if we can get it done by using a control plane message exchange
there is no need to make a L2TP-header change for this.
    

  
[pwh4] I'm not sure how to interpret this comment.  I agree we don't want to
make changes where we can avoid them, but this needs to be balanced against
providing a resource efficient solution.  By creating the concept of
unnumbered frames to do reliable layer signaling then we can have the reliable
layer re-sync the control plane without requiring a separate tunnel to do the
resync and without requiring a tid change.  IMHO, this would seem to justify
taking a reserved bit to indicate UI.
    
I don't think introducing the concept of unnumbered messages and changing
header bits is a good idea. It would bring in the following problems:
- How do you ack an unnumbered message? More importantly, how do you reliably
transmit them? Do they take the same transmit queue and comply with rx window
constraints?
- What if a node is bombarded with such unnumbered messages? Are we supposed to
interpret each one of them?

  
Using Ns=0, Nr=0 Scheme: This is flawed because it might naturally coincide
with the sequence numbers of a tunnel thereby confusing this as an
acknowledgement for a packet sent earlier.
[pwh4] I don't think so.  With unnumbered frames, there is no Ns/Nr thus they
are ignored.  I mentioned 0/0 only because they should probably be set to
something.  The UI bit would be looked at by the reliable layer before it
looks at Ns/Nr.  UI frames would be handled entirely by the reliable layer
and not the protocol layer.
    
Using Ns=0, Nr=0 when UI bit is set invites interpreting any message with Nr=0,
Ns=0 and UI bit set - Does it invite a DoS attack?  
      
  
One negativeto this approach is that it would require the systems to have the
resourcesavailable to at least temporarily manage the additional tunnels -
this mightimply being able to temporarily peak at double the normally 
supported numberof tunnels.   If the non-failed endpoint was at it's maximum 
number of tunnels,how would it know to make an exception for the failover 
tunnel setup?    
    
    
  
And therefore we would RECOMMEND keeping space for at least tunnel for 
recoverypurposes. If it can't establish the tunnel, then it could retry 'x' 
timesbefore concluding the tunnel recovery mechanism failed. This is much 
betterthan current proposal where we'd establish one new tunnel for every old
    

  
tunnel.  
    

  
[pwh4] If we want recovery to proceed expeditiously (and we really only have a
maximum of 1 hello timeout plus max retransmit timeout), then we really need
to be able to recover tunnels in parallel.  This could be very resource
intensive.  It doesn't seem likely that a large number of tunnels will
necessarily be coming from the same peer.  I typically see not more than a
small handful (say 3) of tunnels coming from the same peer to allow for
different service policies.  With this scenario, I'd need an additional 33%
resources to recover the tunnels in parallel - that's a rather steep price to
pay.

Parallel recovery was definitely one of the design goals (Appendix A.2), so, in
line with your thinking, if we wish to recover in parallel we'd need to reserve
the resources. Having a three-way dialogue (to reset one another's control
plane) is a MUST. Whether to do that on the existing tunnel is what we are
evaluating.

My proposal is to use the existing mechanism in the draft but reset the old
tunnel's control plane, thereby keeping the data plane hitless (because the old
tunnel-id is intact). This is workable and fits the existing constructs of
tunnel establishment, including individually authenticating peers upon restart.

  
How does the transition to the new sequence number occur? I presume we hand
off the new sequence numbers to the reliable layer and it then purges its
re-ordering rx queue, renumbers any outstanding transmits and immediately
re-transmits them?  Any stale receives get automatically discarded as out of
window.

- Renumbering outstanding transmits might create more unpredictability for no
benefit. Because the control plane on the failed node would have lost the
context related to previous messages, it is better to flush everything and
start with new sequence numbers.

[pwh4] I disagree.  Say the outstanding transmit is an ICCN.  The failed
node may or may not have remembered sending the ICRP.  By re-sending the ICCN,
we allow the failed node at least the option of continuing the setup.
    
The target of the draft was to recover only the sessions that were in
established state. This means if an endpoint does not keep track of session's
intermediate state (i.e. ICRP sent, or awaiting ICCN) then it doesn't matter if
the other sends an ICCN or not, it would be discarded upon control plane
restart for the tunnel.

  
The state machines at the protocol layer should already be capable of
handling an unexpected ICCN (or for that matter any unexpected packet).  If
we were to throw away these packets, then we're potentially requiring
additional complication at the protocol layer.

Agreed; the state machine should be able to handle any packet in the life of a
session or a tunnel. Regarding additional complication:
- I think section 2.3.4 addresses the inconsistency among session states.
- Renumbering the existing messages in the transmit queue is not going to
eliminate the conditions that could result in the situations described in
2.3.4, so that needs to be there anyway. Then why bother resending these
messages?
- If we are recovering only the sessions that were in established state then
there is no need to retransmit messages for situations that could be handled
otherwise by defined mechanisms.

I don't know how your protocol layer is implemented, but mine treats the
reliable layer as a pipe.  What I put into it is guaranteed to be delivered or
I get a tunnel failed indication.  It doesn't seem that I'd really want a
tunnel failed indication here, but that would be my only choice if the
reliable layer threw away a packet that had been submitted for transmit.

From what I have seen, the reliable layer typically is like a pipe which will
either reliably deliver a packet or provide a tunnel failure indication.
Therefore it doesn't make a difference from a delivery perspective whether we
renumber them or not. Once queued they'll be delivered. But my point was - why
even re-mark their sequence numbers?
  
  
The old tunnel becomes active only upon getting confirmation from its peer
that it has reset control plane sequence numbers. So this would have to be a
three way handshake as described below. For example, for an old tunnel:
- Failed node sends: "Reset Nr to 5665 for Old Local Tid = 9, Old Tid = 6".
- Non failed node responds: "Nr for Old Tid = 9 reset to 5665, Reset Nr to
  4435 for the same tunnel (i.e. Old Local Tid = 6, Old Tid = 9 from this
  node's perspective)". Failed endpoint upon getting this message must first
  enqueue the response of this message and then start sending control messages
  on the Old tunnel.
- Failed node sends: "Nr for Old Tid = 9 reset to 4435". Non failed node upon
  getting this can start sending control messages.

[pwh4] I presume in the case where control messages with the new sequence
number space start arriving at the non-failed node before the final resync
message, the messages get discarded as OOW.  The failed node will then
re-transmit and they'll be accepted once the non-failed node gets the final
resync message.
    
What we discuss above could work. However, to make it simpler I think simply
resetting the control plane of old tunnel would be good enough.
  

thanks,
-- vipin

_______________________________________________ L2tpext mailing list L2tpext@ietf.org https://www1.ietf.org/mailman/listinfo/l2tpext From exim@www1.ietf.org Wed Nov 12 17:25:42 2003 Received: from optimus.ietf.org ([132.151.1.19]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA13237 for ; Wed, 12 Nov 2003 17:25:42 -0500 (EST) Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AK3QR-00047x-7g for l2tpext-archive@odin.ietf.org; Wed, 12 Nov 2003 17:25:25 -0500 Received: (from exim@localhost) by www1.ietf.org (8.12.8/8.12.8/Submit) id hACMPN3d015859 for l2tpext-archive@odin.ietf.org; Wed, 12 Nov 2003 17:25:23 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AK3QR-00047i-2w for l2tpext-web-archive@optimus.ietf.org; Wed, 12 Nov 2003 17:25:23 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA13205 for ; Wed, 12 Nov 2003 17:25:09 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AK3QO-0007hU-00 for l2tpext-web-archive@ietf.org; Wed, 12 Nov 2003 17:25:20 -0500 Received: from [132.151.1.19] (helo=optimus.ietf.org) by ietf-mx with esmtp (Exim 4.12) id 1AK3QO-0007hO-00 for l2tpext-web-archive@ietf.org; Wed, 12 Nov 2003 17:25:20 -0500 Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AK3Q7-00045Y-Dh; Wed, 12 Nov 2003 17:25:03 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AK3PX-000449-Jo for l2tpext@optimus.ietf.org; Wed, 12 Nov 2003 17:24:30 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA13170 for ; Wed, 12 Nov 2003 17:24:09 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AK3PQ-0007gE-00 for L2tpext@ietf.org; Wed, 12 Nov 2003 
17:24:20 -0500 Received: from westford-nat.juniper.net ([65.194.140.2] helo=pi-smtp.jnpr.net) by ietf-mx with esmtp (Exim 4.12) id 1AK3PP-0007fC-00 for L2tpext@ietf.org; Wed, 12 Nov 2003 17:24:19 -0500 Received: from juniper.net ([10.10.248.153] RDNS failed) by pi-smtp.jnpr.net with Microsoft SMTPSVC(5.0.2195.6713); Wed, 12 Nov 2003 17:23:48 -0500 Message-ID: <3FB2B20D.3000205@juniper.net> Date: Wed, 12 Nov 2003 17:19:57 -0500 From: Paul Howard User-Agent: Mozilla/5.0 (Windows; U; WinNT4.0; en-US; rv:1.0.2) Gecko/20030208 Netscape/7.02 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Vipin Jain CC: l2tp ietflist , Keyur Parikh , vipin@riverstonenet.com References: <20031105040538.90291.qmail@web41301.mail.yahoo.com> Content-Type: text/html; charset=us-ascii Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 12 Nov 2003 22:23:48.0075 (UTC) FILETIME=[A45E63B0:01C3A96B] Content-Transfer-Encoding: 7bit Subject: [L2tpext] Re: Questions about draft-ietf-l2tpext-failover-02.txt Sender: l2tpext-admin@ietf.org Errors-To: l2tpext-admin@ietf.org X-BeenThere: l2tpext@ietf.org X-Mailman-Version: 2.0.12 Precedence: bulk List-Unsubscribe: , List-Id: Layer Two Tunneling Protocol Extensions List-Post: List-Help: List-Subscribe: , Content-Transfer-Encoding: 7bit Content-Transfer-Encoding: 7bit Vipin,

Thanx for your responses.  Rather than inlining further comments, I've tried to summarize (from my perspective at least) where things stand.  I've tried to bring out all of the proposals that we've discussed, but may have missed some in the morass our e-mail has become :-)

It seems that the crux of the discussion boils down to whether to use a separate recovery tunnel for each extant tunnel (as described in the current draft) or to do recovery within each tunnel.   Any solution needs to allow for the possibility of a completely hitless data plane (i.e. 0 data packet loss).   Discussions so far have focused upon a basic 3 way handshake to perform the recovery of the control connection's sequence numbers regardless of whether a separate recovery tunnel is used.

Using a separate recovery tunnel seems to present the following advantages and disadvantages:

Plus:

- Leverages extant reliable packet delivery of L2TP control connection via separate recovery tunnel

Minus:

- To allow parallelism (necessary to recover in a timely fashion and avoid disconnects due to tunnel hello failures), the endpoints must be capable of supporting some number of tunnels over and above their normal operating limits.   Given my implementation currently runs up to 8,000 tunnels, I'd be looking at having to allow transient peaks of 16,000 tunnels for fully parallel recovery.   Using less than a 100% resource reserve and a retry mechanism as you have proposed (thus doing only a subset of the tunnels in parallel) would reduce the resource requirements, but increase the likelihood of tunnels failing due to increased latency in tunnel recovery.
- Unless the semantics are changed to allow two tunnels with the same TIDs, the TIDs of the recovery tunnel must differ from the old tunnel.   This implies re-programming of the data plane for all active sessions and, most likely, at least a brief interruption of data flow.    Your proposal to reset the old tunnel's sequence numbers instead of replacing the old tunnel with the recovery tunnel would address this issue.   It does, however, carry with it additional packet overhead (due to the required shutdown of the recovery tunnel as opposed to the silent abandonment of the old tunnel).

Using the exisiting tunnel to recover from the failover:

Plus:

- No additional resources required to allow fully parallel recovery
- No reprogramming of the data plane required (in the case of a control plane only failover) - thus no interruption of the data flow.

Minus:

- Reliable packet delivery mechanisms of the control connection are unavailable to do the recovery handshake (since one side has lost it's knowledge of Ns and Nr).   This implies that the handshake must be done in an unnumbered mode.   Procedures for reliable delivery and acknowledgement would need to be provided; however, the nature of the 3 way handshake deals with most of these issues.

You mentioned specific concerns about unnumbered mode in your last response.

 - Acknowledgements - A possible approach is that all but the last packet of the 3 way handshake is handled by the next packet in the handshake; the last packet in the handshake delivers the last of the sequence number reset data and thus can be acknowledged by requiring a normal ZLB Ack upon receipt.
- Reliable transmit - It seems that retransmits of outstanding handshake frames pending acknowledgement would handle this issue.
- Transmit queue - The normal transmit queue of the control connection is out of commission pending re-sync of the sequence numbers.   Any packets on this queue would be held pending conclusion of the resync.
- Receive window constraints - The receive window is also out of commission pending re-sync of the sequence numbers thus it's not clear how it could even be applied.   The 3 way handshake does in itself effectively enforce a flow control with an RWS of 1.  This shouldn't be an issue since nothing else can happen on the tunnel's control connection pending resync.
- Bombarded with unnumbered traffic - This could happen with any frame (numbered or unnumbered) and a non-well behaved peer (or hacker).   The control connection would have to do at least some packet examination to discard (but it does for bogus numbered mode frames as well).   If a more efficient discard of such frames was an issue, then an exchange of re-sync cookies as part of the initial control connection setup would allow more efficient discard of bogus frames and allow preliminary validation of a resync request (all unnumbered mode frames would be required to carrry the appropriate resync cookie).
- DOS attack - I'm assuming a hacker with no ability to snoop (if the hacker can snoop, then they can just send the appropriate StopCCN and any protections on the resync mechanism or issues with numbered vs unnumbered mode are moot).   Up to the point of the hacker guessing source IP, dest IP, and TID there don't seem to be any difference in the susceptabilty of numbered vs unnumbered mode.   After this point, numbered mode has the advantage of requiring the proper guess for Ns.   The use of a resync cookie (or more generically a unnumbered mode cookie) in all unnumbered frames would provide an equivalent level of protection for unnumbered mode.

Other outstanding issues that I see in the e-mail thread:

- What to do about frames on the transmit queue of the control connection? I had suggested renumbering them once the new sequence numbers had been established. You suggested discarding all of the frames. I think either approach will work. I'm concerned about some of the issues arising from a decision to discard - primarily the impact on the protocol layer, which can no longer submit frames and expect them to be unconditionally delivered barring a tunnel failure. Consider an established session where the non-failed endpoint has just queued a CDN for transmit when a failover occurs. Both endpoints think the session is established, but with a discard of the pending transmits, the CDN never gets sent. Now the FSQ/FSR mechanism would detect this; however, the discard of the CDN may have tossed information of interest to the peer (e.g. disconnect cause from RFC 3145).

I guess I don't see the harm in re-numbering upon resync. Any frames in the transmit queue are by definition outstanding (some may not even have been sent yet due to flow control). The frames may or may not have been received at the peer before the failover (and if received may or may not have been remembered). The worst case is that the frame was received and remembered, at which point the protocol layer at the peer will get a second copy of the packet (the control connection duplicate elimination won't catch it) and the session will most likely get torn down as a result (since the 2nd packet would arrive while the state machine is not expecting it). Frames not remembered would continue whatever action was being attempted before the failover without harm.

I suppose that if failover is slow enough, any frames in the transmit queue would be so stale as to be not worth sending. I question whether this is justification to toss the frames in the transmit queue, given that failover may occur fast enough to avoid imposing any staleness on these frames. A decision to toss outstanding frames means the protocol layer must be adjusted to deal with transmit failure in the absence of tunnel failure - thus, e.g., an established session may have to re-submit a CDN.

In one of your comments, you brought up re-transmitting without re-numbering. I believe this choice will cause the tunnel to fail, since such packets will not be acknowledged by the peer, causing re-transmission of the packets and eventual tunnel failure. Having said all of the above, it's probably reasonable that this be an implementation decision (the implementation must either discard the contents of the transmit queue or re-number the frames; the implementation must not transmit without re-numbering). It may be worthwhile to have a section describing the alternatives and the potential issues with each.
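
The two permitted policies for the transmit queue can be sketched roughly as follows. This is only an illustration under stated assumptions, not text from the draft: the `Frame` class and the `renumber` and `discard` helpers are hypothetical names, and the only protocol fact relied upon is that L2TP Ns/Nr are 16-bit counters that wrap.

```python
from collections import deque
from dataclasses import dataclass

SEQ_MOD = 1 << 16  # L2TP Ns/Nr are 16-bit counters, so they wrap at 65536


@dataclass
class Frame:
    """Hypothetical stand-in for one queued control message."""
    msg_type: str  # e.g. "CDN", "ICCN"
    ns: int        # sequence number stamped when the frame was queued


def renumber(tx_queue, new_ns):
    """Policy 1: re-stamp every outstanding frame, in order, from new_ns."""
    for frame in tx_queue:
        frame.ns = new_ns
        new_ns = (new_ns + 1) % SEQ_MOD
    return new_ns  # Ns to give the next newly submitted frame


def discard(tx_queue):
    """Policy 2: drop everything and report what the protocol layer lost."""
    dropped = [frame.msg_type for frame in tx_queue]
    tx_queue.clear()
    return dropped


queue = deque([Frame("CDN", 120), Frame("HELLO", 121)])
next_ns = renumber(queue, 65535)  # new sequence space starts just before the wrap
```

With the discard policy the list returned by `discard` is what the protocol layer would need to be told about (for example, so that a session can re-submit its CDN); with the renumber policy nothing is reported because nothing is lost.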

Thanx,

Paul

Vipin Jain wrote:
hi Paul,

my response inline..

  
Reserved Bits: I am in for that. Based on the discussions earlier, Mark and we
agreed that if we can get it done by using a control plane message exchange,
there is no need to make an L2TP-header change for this.

[pwh4] I'm not sure how to interpret this comment.   I agree we don't want to
make changes where we can avoid them, but this needs to be balanced against
providing a resource efficient solution.   By creating the concept of
unnumbered frames to do reliable layer signaling, we can have the reliable
layer re-sync the control plane without requiring a separate tunnel to do the
resync and without requiring a tid change.   IMHO, this would seem to justify
taking a reserved bit to indicate UI.
    
I don't think introducing the concept of unnumbered messages and changing
header bits is a good idea. It would bring in the following problems:
- How do you ack an unnumbered message? More importantly, how do you reliably
transmit them? Do they take the same transmit queue, and are they subject to
rx window constraints?
- What if a node is bombarded with such unnumbered messages? Are we supposed to
interpret each one of them?

  
Using Ns=0, Nr=0 Scheme: This is flawed because it might naturally coincide
with the sequence numbers of a tunnel, thereby confusing this as an
acknowledgement for a packet sent earlier.

[pwh4] I don't think so.   With unnumbered frames, there is no Ns/Nr, thus they
are ignored.   I mentioned 0/0 only because they should probably be set to
something.   The UI bit would be looked at by the reliable layer before it
looks at Ns/Nr.   UI frames would be handled entirely by the reliable layer
and not the protocol layer.
    
Using Ns=0, Nr=0 when UI bit is set invites interpreting any message with Nr=0,
Ns=0 and UI bit set - Does it invite a DoS attack?  
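
A rough sketch of the dispatch order described above, in which the reliable layer looks at the UI bit before it looks at Ns/Nr. The T bit value and the 12-byte control header layout come from RFC 2661; the position chosen for `UI_BIT` is purely an assumption for illustration (it is simply one of the currently reserved bits), and `classify` and `build` are hypothetical helpers.

```python
import struct

T_BIT = 0x8000   # control message flag (RFC 2661)
UI_BIT = 0x2000  # ASSUMPTION: an arbitrary reserved bit, not assigned by any spec


def classify(packet: bytes) -> str:
    """Decide which layer handles a packet, checking UI before Ns/Nr."""
    flags, _length, _tid, _sid, _ns, _nr = struct.unpack("!6H", packet[:12])
    if not flags & T_BIT:
        return "data"
    if flags & UI_BIT:
        # Handled entirely by the reliable layer; Ns/Nr are never examined.
        return "unnumbered"
    return "numbered"


def build(flags: int, tid: int, ns: int = 0, nr: int = 0) -> bytes:
    """Pack a bare 12-byte control-style header (no AVPs) for the example."""
    return struct.pack("!6H", flags, 12, tid, 0, ns, nr)


CONTROL_FLAGS = 0xC802  # T, L and S bits set, version 2
kind = classify(build(CONTROL_FLAGS | UI_BIT, tid=9))
```

Because the UI check comes first, the 0/0 values carried in an unnumbered frame can never be mistaken for an acknowledgement; whether that is enough to answer the DoS concern is a separate question.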
      
  
One negative to this approach is that it would require the systems to have the
resources available to at least temporarily manage the additional tunnels -
this might imply being able to temporarily peak at double the normally
supported number of tunnels.   If the non-failed endpoint was at its maximum
number of tunnels, how would it know to make an exception for the failover
tunnel setup?
    
    
  
And therefore we would RECOMMEND keeping space for at least one tunnel for
recovery purposes. If it can't establish the tunnel, then it could retry 'x'
times before concluding the tunnel recovery mechanism failed. This is much
better than the current proposal where we'd establish one new tunnel for every
old tunnel.
    

  
[pwh4] If we want recovery to proceed expeditiously (and we really only have a
maximum of 1 hello timeout plus max retransmit timeout), then we really need
to be able to recover tunnels in parallel.   This could be very resource
intensive.   It doesn't seem likely that a large number of tunnels will
necessarily be coming from the same peer.    I typically see not more than a
small handful (say 3) tunnels coming from the same peer to allow for
different service policies.   With this scenario, I'd need an additional 33%
resources to recover the tunnels in parallel - that's a rather steep price to
pay.
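
The arithmetic behind these figures can be written out as a small sketch. It assumes (this is an interpretation, not something stated outright) that the 33% figure corresponds to one recovery tunnel reserved per peer with 3 tunnels each, while one recovery tunnel per old tunnel is a 100% reserve; the function names are hypothetical.

```python
import math


def reserve_fraction(tunnels_per_peer: int, recovery_tunnels: int = 1) -> float:
    """Extra capacity needed, as a fraction of the normal tunnel count."""
    return recovery_tunnels / tunnels_per_peer


def peak_tunnels(normal_limit: int, reserve: float) -> int:
    """Transient peak the endpoint must tolerate during recovery."""
    return math.ceil(normal_limit * (1 + reserve))


per_peer = reserve_fraction(3)       # one recovery tunnel shared by 3 tunnels
per_tunnel = reserve_fraction(3, 3)  # one recovery tunnel per old tunnel
```

Applied to an endpoint that normally runs 8,000 tunnels (the figure quoted later in this thread), `peak_tunnels(8000, per_tunnel)` gives the 16,000-tunnel transient peak, and `peak_tunnels(8000, per_peer)` gives roughly 10,667.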
    
Parallel recovery was definitely one of the design goals (Appendix A.2); so,
in line with your thinking, if we wish to recover we'd need to reserve the
resources. Having a three-way dialogue (to reset one another's control plane)
is a MUST. Now, doing that on the existing tunnel is what we are evaluating.

My proposal is to use the existing mechanism in the draft but reset the old
tunnel's control plane, thereby keeping the data plane hitless (because the
old tunnel-id is intact). This is workable and fits the existing constructs of
tunnel establishment, including individually authenticating peers upon restart.

  
How does the transition to the new sequence number occur? I presume we hand
off the new sequence numbers to the reliable layer and it then purges its
re-ordering rx queue, renumbers any outstanding transmits and immediately
re-transmits them?  Any stale receives get automatically discarded as out of
window.
    
- Renumbering outstanding transmits might create more unpredictability for no
benefit. Because the control plane on the failed node would have lost the
context related to previous messages, it is better to flush everything off and
start with new sequence numbers.
    

  
[pwh4] I disagree.   Say the outstanding transmit is an ICCN.   The failed
node may or may not have remembered sending the ICRP.   By re-sending the
ICCN, we allow the failed node at least the option of continuing the setup.
    
The target of the draft was to recover only the sessions that were in
established state. This means if an endpoint does not keep track of session's
intermediate state (i.e. ICRP sent, or awaiting ICCN) then it doesn't matter if
the other sends an ICCN or not, it would be discarded upon control plane
restart for the tunnel.

  
The state machines at the protocol layer should already be capable of
handling an unexpected ICCN (or for that matter any unexpected packet).    If
we were to throw away these packets, then we're potentially requiring
additional complication at the protocol layer.
    
Agreed; the state machine should be able to handle any packet in the life of a
session or a tunnel. Regarding additional complication:
- I think section 2.3.4 addresses the inconsistency among session states.
- Renumbering the existing messages in the transmit queue is not going to
eliminate the conditions that could result in the situations described in
2.3.4, so that needs to be there anyway. Then why bother resending these
messages?
- If we are recovering only the sessions that were in established state, then
there is no need to retransmit messages for situations that could be handled
otherwise by defined mechanisms.

  
I don't know how your protocol layer is implemented, but mine treats
the reliable layer as a pipe.
What I put into it is guaranteed to be delivered or I get a tunnel failed
indication.  It doesn't seem that I'd really want a tunnel failed indication
here, but that would be my only choice if the reliable layer threw away a
packet that had been submitted for transmit.
    
From what I have seen, the reliable layer typically is like a pipe which will
either reliably deliver a packet or provide a tunnel failure indication.
Therefore it doesn't make a difference from a delivery perspective whether we
renumber them or not. Once queued they'll be delivered. But my point was - why
even re-mark their sequence numbers?
  
  
The old tunnel becomes active only upon getting confirmation from its peer
that it has reset control plane sequence numbers. So this would have to be a
three-way handshake as described below. For example, for an old tunnel:
- Failed node sends: "Reset Nr to 5665 for Old Local Tid = 9, Old Tid = 6".
- Non-failed node responds: "Nr for Old Tid = 9 reset to 5665, Reset Nr to
4435 for the same tunnel (i.e. Old Local Tid = 6, Old Tid = 9 from this
node's perspective)". The failed endpoint, upon getting this message, must
first enqueue the response to this message and then start sending control
messages on the old tunnel.
- Failed node sends: "Nr for Old Tid = 9 reset to 4435". The non-failed node,
upon getting this, can start sending control messages.
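
The three steps above can be traced with a small simulation. This is a sketch only: the `Endpoint` class and `resync` function are hypothetical, and it assumes that "Reset Nr to X" means "expect X as the next sequence number from me". The tunnel IDs and sequence numbers are the ones from the example.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Hypothetical per-tunnel control connection state."""
    name: str
    local_tid: int
    peer_tid: int
    next_ns: int           # sequence number this side will send next
    expected_nr: int = -1  # next sequence number expected from the peer
    active: bool = False   # may this side send normal control messages?


def resync(failed: Endpoint, peer: Endpoint) -> list:
    """Walk through the three messages and return a log of what was sent."""
    log = []
    # 1. Failed node announces its new sequence space.
    log.append(f"{failed.name}: reset Nr to {failed.next_ns}")
    peer.expected_nr = failed.next_ns
    # 2. Non-failed node confirms and announces its own.
    log.append(f"{peer.name}: Nr reset to {peer.expected_nr}, "
               f"reset Nr to {peer.next_ns}")
    failed.expected_nr = peer.next_ns
    # 3. Failed node enqueues its confirmation, then may resume sending.
    log.append(f"{failed.name}: Nr reset to {failed.expected_nr}")
    failed.active = True
    # The non-failed node resumes only once message 3 has arrived.
    peer.active = True
    return log


failed_node = Endpoint("failed", local_tid=9, peer_tid=6, next_ns=5665)
other_node = Endpoint("non-failed", local_tid=6, peer_tid=9, next_ns=4435)
messages = resync(failed_node, other_node)
```

The sketch ignores loss and retransmission of the handshake messages themselves, which is exactly the part that differs between running the handshake on a recovery tunnel and running it unnumbered on the old tunnel.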
    

  
[pwh4] I presume in the case where control messages with the new sequence
number space start arriving at the non-failed node before the final resync
message, the messages get discarded as OOW.   The failed node will then
re-transmit and they'll be accepted once the non-failed node gets the final
resync message.
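
A minimal sketch of the out-of-window (OOW) test this relies on, assuming a simple receive window of `window` frames starting at the next expected sequence number and 16-bit wrap-around arithmetic. The function name, the window size of 4, and the old expected value of 120 are hypothetical; 5665 is the value from the handshake example.

```python
SEQ_MOD = 1 << 16  # Ns/Nr are 16-bit and wrap


def in_window(ns: int, expected: int, window: int) -> bool:
    """True if ns lies in [expected, expected + window) modulo 2**16."""
    return (ns - expected) % SEQ_MOD < window


# Before the final resync message the non-failed node still expects its old
# value (120 here), so a frame numbered in the new space is dropped as OOW.
early = in_window(5665, expected=120, window=4)

# After the final resync message the same retransmitted frame is accepted.
late = in_window(5665, expected=5665, window=4)
```

The modulo subtraction also handles the wrap case, for example `in_window(1, expected=65534, window=4)` is true.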
    
What we discuss above could work. However, to make it simpler I think simply
resetting the control plane of old tunnel would be good enough.
  

thanks,
-- vipin


















  

From l2tpext-admin@ietf.org Tue Nov 18 14:45:29 2003 Received: from optimus.ietf.org ([132.151.1.19]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA09590 for ; Tue, 18 Nov 2003 14:45:29 -0500 (EST) Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AMBmY-0002S2-Pm; Tue, 18 Nov 2003 14:45:02 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AMBmQ-0002RU-ET for l2tpext@optimus.ietf.org; Tue, 18 Nov 2003 14:44:54 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA09522 for ; Tue, 18 Nov 2003 14:44:41 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AMBmN-00048L-00 for L2tpext@ietf.org; Tue, 18 Nov 2003 14:44:51 -0500 Received: from web41303.mail.yahoo.com ([66.218.93.52]) by ietf-mx with smtp (Exim 4.12) id 1AMBmM-00047g-00 for L2tpext@ietf.org; Tue, 18 Nov 2003 14:44:50 -0500 Message-ID: <20031118194419.9873.qmail@web41303.mail.yahoo.com> Received: from [66.17.149.13] by web41303.mail.yahoo.com via HTTP; Tue, 18 Nov 2003 11:44:19 PST Date: Tue, 18 Nov 2003 11:44:19 -0800 (PST) From: Vipin Jain To: Paul Howard Cc: l2tp ietflist , Keyur Parikh , vipin@riverstonenet.com In-Reply-To: <3FB2B20D.3000205@juniper.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Subject: [L2tpext] Re: Questions about draft-ietf-l2tpext-failover-02.txt Sender: l2tpext-admin@ietf.org Errors-To: l2tpext-admin@ietf.org X-BeenThere: l2tpext@ietf.org X-Mailman-Version: 2.0.12 Precedence: bulk List-Unsubscribe: , List-Id: Layer Two Tunneling Protocol Extensions List-Post: List-Help: List-Subscribe: ,

Paul, my comments inline within '<> <>' tags.

It seems that the crux of the discussion boils down to whether to use a separate recovery tunnel for each extant tunnel (as described in the current draft) or to do recovery within each tunnel. Any solution needs to allow for the possibility of a completely hitless data plane (i.e. 0 data packet loss). Discussions so far have focused upon a basic 3 way handshake to perform the recovery of the control connection's sequence numbers regardless of whether a separate recovery tunnel is used.

<> agree, and therefore so far we agree that retaining the old tunnel id (and its control session) is probably the best way to handle it. <>

Using a separate recovery tunnel seems to present the following advantages and disadvantages:

Plus:
- Leverages extant reliable packet delivery of L2TP control connection via separate recovery tunnel

<> The mechanism does allow parallel recovery. However, as you point out there is a penalty of resource reservation. <>

Minus:
- To allow parallelism (necessary to recover in a timely fashion and avoid disconnects due to tunnel hello failures), the endpoints must be capable of supporting some number of tunnels over and above their normal operating limits. Given my implementation currently runs up to 8,000 tunnels, I'd be looking at having to allow transient peaks of 16,000 tunnels for fully parallel recovery. Using less than a 100% resource reserve and a retry mechanism as you have proposed (thus doing only a subset of the tunnels in parallel) would reduce the resource requirements, but increase the likelihood of tunnels failing due to increased latency in tunnel recovery.

<> Well, I think the situation is not that bad. For example, let me consider a few resources needed for tunnel allocation and its maintenance:
- hardware resource allocation (i.e. to enable data flow on a given tunnel)
- Software data structures for a tunnel
- Transmit buffers
- Tunnel Ids themselves

Let me comment on each one individually:
- There shouldn't be any hardware resource allocation for the new tunnel (created to restore the old tunnel) because all traffic is control traffic. Also the new tunnel being established for recovery will not have any sessions established on it, thereby eliminating the need for the hardware resource. I agree, it is implementation specific, but still it is not a big pitch.
- Transmit buffers on the new tunnel have minimal requirements. The tunnel could be torn down as soon as the old tunnel is recovered. An implementation could have all transmit buffers related to the old tunnel be freed as soon as it successfully receives the indication that the old tunnel needs recovery.
- Software data structures, etc. need to be allocated. I wonder if per tunnel memory consumption, etc. is that high. Given that there will not be any sessions established on that tunnel, we are limiting the amount of software resources needed for that as well.
- Tunnel Ids that are allocated for recovery are ephemeral and shouldn't be a big road block IMHO.

In summary, I'd say these things could be taken care of given the alternatives we face. <>

- Unless the semantics are changed to allow two tunnels with the same TIDs, the TIDs of the recovery tunnel must differ from the old tunnel. This implies re-programming of the data plane for all active sessions and, most likely, at least a brief interruption of data flow. Your proposal to reset the old tunnel's sequence numbers instead of replacing the old tunnel with the recovery tunnel would address this issue. It does, however, carry with it additional packet overhead (due to the required shutdown of the recovery tunnel as opposed to the silent abandonment of the old tunnel).

<> I think there is a trade-off: either we can introduce a totally new mechanism or use a mechanism that overlays the existing mechanism. As you mention, if we use the existing mechanism there is some overhead of tunnels getting established and torn down. In fact we faced the same question long ago, whether to use the existing mechanism or utilize newer mechanisms, and decided to go ahead with what we have. <>

Using the existing tunnel to recover from the failover:

Plus:
- No additional resources required to allow fully parallel recovery

<> As I mentioned earlier, if resources are managed, parallel recovery is possible in the other mechanism that was proposed earlier. <>

- No reprogramming of the data plane required (in the case of a control plane only failover) - thus no interruption of the data flow.

<> Given that we rectify the draft to accommodate the mechanism to retain old tunnel ids for the sessions, this shouldn't be a problem in what I proposed. <>

Minus:
- Reliable packet delivery mechanisms of the control connection are unavailable to do the recovery handshake (since one side has lost its knowledge of Ns and Nr). This implies that the handshake must be done in an unnumbered mode. Procedures for reliable delivery and acknowledgement would need to be provided; however, the nature of the 3 way handshake deals with most of these issues.

<> In general my worry is: introducing unnumbered packets on the control plane for tunnel recovery introduces a new dimension to how the reliable layer fundamentally works. Changing that now might be a little late, given the number of implementations. More comments inline.. <>

You mentioned specific concerns about unnumbered mode in your last response.

- Acknowledgements - A possible approach is that all but the last packet of the 3 way handshake is handled by the next packet in the handshake; the last packet in the handshake delivers the last of the sequence number reset data and thus can be acknowledged by requiring a normal ZLB Ack upon receipt.

<> To Ack a given packet, it must be numbered I suppose, which means you are proposing we run a parallel train of sequence numbers on the same control plane? Isn't it like running two different tunnels over the same tunnel, and will it not require fundamental changes to the reliable layer implementations? And is it really worth running two sets of transmit and receive buffers per tunnel to be able to support failure recovery? Will it not chew up resources? <>

- Reliable transmit - It seems that retransmits of outstanding handshake frames pending acknowledgement would handle this issue.

<> Does this mean you are proposing we have multiple timers for retransmit, one for regular traffic, another one for unnumbered frames? Does it not look too kludgy? <>

- Transmit queue - The normal transmit queue of the control connection is out of commission pending re-sync of the sequence numbers. Any packets on this queue would be held pending conclusion of the resync.

<> Still, there have to be two separate (logically at least) transmit queues? <>

- Receive window constraints - The receive window is also out of commission pending re-sync of the sequence numbers, thus it's not clear how it could even be applied. The 3 way handshake does in itself effectively enforce a flow control with an RWS of 1. This shouldn't be an issue since nothing else can happen on the tunnel's control connection pending resync.

<> same concern as above here <>

- Bombarded with unnumbered traffic - This could happen with any frame (numbered or unnumbered) and a non-well behaved peer (or hacker).
The control connection would have to do at least some packet examination to discard (but it does for bogus numbered mode frames as well). If a more efficient discard of such frames was an issue, then an exchange of re-sync cookies as part of theinitial control connection setup would allow more efficient discard of bogusframes and allow preliminary validation of a resync request (all unnumbered mode frames would be required to carrry the appropriate resync cookie). <> I agree, control packet would have to do some packet examination even for numbered traffic, but IMO it is much better to reject it looking at the header than to do the same after having parsed the AVPs. Where do you plan to place the 'cookie' field in? If not in header we mya not achieve the objective. <> <> IMO, if we rectify the existing mechanism to allow retaining the tunnel-id (the original issue that you had brought up - a very valid one, which I think we should take care of) then we'd achieve hitless data plane behavior as suggested by you in the beginning. I'll comment on the rest after settling on other issues after settling on these ones :-) thanks, -- vipin <> - DOS attack - I'm assuming a hacker with no ability to snoop (if the hacker can snoop, then they can just send the appropriate StopCCN and any protections on the resync mechanism or issues with numbered vs unnumbered mode are moot). Up to the point of the hacker guessing source IP, dest IP, and TID there don't seem to be any difference in the susceptabilty of numbered vs unnumbered mode. After this point, numbered mode has the advantage of requiring the proper guess for Ns. The use of a resync cookie (or more generically a unnumberedmode cookie) in all unnumbered frames would provide an equivalent level ofprotection for unnumbered mode. Other outstanding issues that I see in the e-mail thread: - What to do about frames on the transmit queue of the control connection? 
I had suggested renumbering them once the new sequence numbers had been established. You suggested discarding all of the frames. I think either approach will work. I'm concerned about some of the issues arising from a decision to discard - primarily the impact on the protocol layer which canno longer submit frames and expect them to be unconditionally delivered baringa tunnel failure. Consider an established session and the non failed endpointhas just queued a CDN for transmit when a failover occurs. Both endpointsthink the session is established, but with a discard of the pending transmits,the CDN never gets sent. Now the FSQ/FSR mechanism would detect this; however,the discard of the CDN may have tossed information of interest to the peer(e.g disconnect cause from RFC 3145). I guess I don't seem the harm inre-numbering upon resync. Any frames in the transmit queue are by definitionoutstanding (some may not even have been sent yet due to flow control). The frames may or may not have been received at the peer before the failover(and if received may or may not have been remembered). The worst case isthe frame was received and remembered at which point the protocol layer atthe peer will get a second copy of the packet (the control connection duplicateelimination won't catch it) and the session will most likely get torn downas a result (since the 2nd packet would arrive while the state machine isnot expecting it). Frames not remembered would continue whatever actionwas being attempted before the failover without harm. I suppose that iffailover is slow enough, any frames in the transmit queue would be so staleas to be not worth sending. I question whether this is justification totoss the frames in the transmit queue given that failover may occur fastenough to avoid imposing any staleness on these frames. A decision to tossoutstanding frames means the protocol layer must be adjusted to deal withtransmit failure in the absence of tunnel failure - thus e.g. 
an establishedsession may have to re-submit a CDN. In one of your comments, you broughtup re-transmitting without re-numbering. I believe this choice will causethe tunnel to fail since such packets will not be acknowledged by the peercausing re-transmission of the packets and eventual tunnel failure. Havingsaid all of the above, it's probably reasonable that this be an implementationdecision (the implementation must either discard the contents of the transmitqueue or re-number the frames; the implementation must not transmit withoutre-numbering). It may be worthwhile to have a section describing the alternativesand the potential issues with each. Thanx, Paul Vipin Jain wrote: hi Paul,my response inline.. Reserved Bits: I am in for that. Based on the discussions earlier Mark andwe agreed that if we can get it done by using a control plane message exchangethere is no need to make a L2TP-header change for this. [pwh4] I'm not sure how to interpret this comment. I agree we don't wantto make changes where we can avoid them, but this needs to be balanced againstproviding a resource efficient solution. By creating the concept of unnumberedframes to do reliable layer signaling then we can have the reliable layerre-sync the control plane without requiring a separate tunnel to do the resyncand without requiring a tid change. IMHO, this would seem to justify takinga reserved bit to indicate UI. I don't think introducing the concept of having unnumbered messages andchanging header bits is a good idea. It would be bring in following problems:- How do you ack an unumbered message? More importantly how do you relibalytransmit them? Do they take same transmit queue and apply with rx windowconstrains?- What if a node is bombarded with such unnumbered messages? Are we suppose tointerpret each one of them? 
Using Ns=0, Nr=0 Scheme: This is flawed because it might naturally coincidewith the sequence numbers of a tunnel thereby confusing this as anacknowledgement for a packet sent earlier.[pwh4] I don't think so. With unnumbered frames, there is no Ns/Nr thusthey are ignored. I mentioned 0/0 only because they should probably beset to something. The UI bit would be looked at by the reliable layer beforeit looks at Ns/Nr. UI frames would be handled entirely by the reliablelayer and not the protocol layer. Using Ns=0, Nr=0 when UI bit is set invites interpreting any message with Nr=0,Ns=0 and UI bit set - Does it invite a DoS attack? One negativeto this approach is that it would require the systems to have theresourcesavailable to at least temporarily manage the additional tunnels -this mightimply being able to temporarily peak at double the normally supported numberof tunnels. If the non-failed endpoint was at it's maximum number of tunnels,how would it know to make an exception for the failover tunnel setup? And therefore we would RECOMMEND keeping space for at least tunnel for recoverypurposes. If it can't establish the tunnel, then it could retry 'x' timesbefore concluding the tunnel recovery mechanism failed. This is much betterthan current proposal where we'd establish one new tunnel for every old tunnel. [pwh4] If we want recovery to proceed expeditiously (and we really only havea maximum of 1 hello timeout plus max retransmit timeout), then we reallyneed to be able to recovery tunnels in parallel. This could be very resourceintensive. It doesn't seem likely that a large number of tunnels will necessarilybe coming from the same peer. I typically see not more than a small handful(say 3) tunnels coming from the same peer to allow for different servicepolicies. With this scenario, I'd need an additional 33% resources to recoverthe tunnels in parallel - that's a rather steep price to pay. 
Parallel recovery was definitely one of the design goals (Appendix A.2); Soinline with your thinking if we wish to recover we'd need to reserve theresources. Having a three-way dialogue (to reset one another's control plane)is a MUST. Now, doing that on the existing tunnel is what we are evaluating.My proposal is that using the existing mechanism in the draft if we reset theold tunnel's control plane thereby keeping the data plane hitless (because oldtunnel-id is intact) is something workable and fits in existing constructs oftunnel establishment, including individually authenticating peers upon restart. Howdoes the transition to the new sequence number occur? I presume we handoff the new sequence numbers to the reliable layer andit then purges it'sre-ordering rx queue, renumbers any outstanding,transmits and immediatelyre-transmits them? Any stale receives get automatically discarded as outof window. - Renumbering outstanding transmits might create more unpredictability for nobenefit. Because control plane on the failed node would have lost the contextrelated to previous messages, it is better to flush everything off and startwith new sequence numbers. [pwh4] I disagree. Say the outstanding transmit is an ICCN. The failednode may or may not have remembered sending the ICRP. By re-sending theICCN, we allow the failed node at least the option of continuing the setup. The target of the draft was to recover only the sessions that were inestablished state. This means if an endpoint does not keep track of session'sintermediate state (i.e. ICRP sent, or awaiting ICCN) then it doesn't matter ifthe other sends an ICCN or not, it would be discarded upon control planerestart for the tunnel. The state machines at the protocol layer should already be capable of handlingan unexpected ICCN (or for that matter any unexpected packet). If we wereto throw away these packets, then we're potentially requiring additionalcomplication at the protocol layer. 
Agre; State Machine should be able to handle any packet in a life of a sessionor a tunnel. Regarding additional complication:- I think section 2.3.4 addresses the inconsistency among sesison states.- Renumbering the exisitng messages in transmit queue is not going to eliminatethe conditions that could result in the situations described in 2.3.4, so thatneeds to be there anyways. Then why bother resending these messages?- If we are reovering only the sessions that were in established state thenthere is no need to retransmit messages for situations that could be handledotherwise by defined mechanisms. I don't know how your protocol layeris implemented, but mine treatsthe reliable layer as a pipe. What I putinto it is guaranteed to be delivered or I get a tunnel failed indication. It doesn't seem that I'd really want a tunnel failed indication here, butthat would be my only choice if the reliable layer threw away a packet thathad been submitted for transmit. >From what I have seen, the reliable layer typically is like a pipe which willeither reliably deliver a packet or provide with a tunnel failure indication.Therefore it makes doesn't make a difference if we renumber them or not fromdelivery perspective. Once queued they'll be delivered. But my point was - whyeven remark their sequence numbers? The old tunnel becomes active only upon getting confirmation from its peer thatit has reset control plane sequence numbers. So this would have to be a threeway handshake as described below. For example, for an old tunnel:- Failed node sends: "Reset Nr to 5665 for Old Local Tid = 9, Old Tid = 6".- Non Failed node responds: "Nr for Old Tid = 9 reset to 5665, Reset Nr to 4435for the same tunnel (i.e. Old Local Tid = 6, Old Tid = 9 from this node'sperspective)". Failed endpoint upon getting this message must first enqueue theresponse of this message and then start sending control messages on the Oldtunnel.- Failed node sends: "Nr for Old Tid = 9 reset to 4435". 
Non failed node upongetting this can start sending control messages. [pwh4] I presume in the case where control messages with the new sequencenumber space start arriving at the non-failed node before the final resyncmessage, the messages get discarded as OOW. The failed node will then re-transmitand they'll be accepted once the non-failed node gets the final resync message. What we discuss above could work. However, to make it simpler I think simplyresetting the control plane of old tunnel would be good enough. thanks,-- vipine__________________________________Do you Yahoo!?Protect your identity with Yahoo! Mail AddressGuardhttp://antispam.yahoo.com/whatsnewfree __________________________________ Do you Yahoo!? Protect your identity with Yahoo! Mail AddressGuard http://antispam.yahoo.com/whatsnewfree _______________________________________________ L2tpext mailing list L2tpext@ietf.org https://www1.ietf.org/mailman/listinfo/l2tpext From exim@www1.ietf.org Tue Nov 18 14:45:33 2003 Received: from optimus.ietf.org ([132.151.1.19]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA09622 for ; Tue, 18 Nov 2003 14:45:33 -0500 (EST) Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AMBmj-0002U7-Py for l2tpext-archive@odin.ietf.org; Tue, 18 Nov 2003 14:45:15 -0500 Received: (from exim@localhost) by www1.ietf.org (8.12.8/8.12.8/Submit) id hAIJjD3E009539 for l2tpext-archive@odin.ietf.org; Tue, 18 Nov 2003 14:45:13 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AMBmj-0002Th-JX for l2tpext-web-archive@optimus.ietf.org; Tue, 18 Nov 2003 14:45:13 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA09548 for ; Tue, 18 Nov 2003 14:45:00 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AMBmg-00048r-00 for l2tpext-web-archive@ietf.org; Tue, 18 Nov 2003 14:45:10 
-0500 Received: from [132.151.1.19] (helo=optimus.ietf.org) by ietf-mx with esmtp (Exim 4.12) id 1AMBmf-00048h-00 for l2tpext-web-archive@ietf.org; Tue, 18 Nov 2003 14:45:09 -0500 Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AMBmY-0002S2-Pm; Tue, 18 Nov 2003 14:45:02 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AMBmQ-0002RU-ET for l2tpext@optimus.ietf.org; Tue, 18 Nov 2003 14:44:54 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA09522 for ; Tue, 18 Nov 2003 14:44:41 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AMBmN-00048L-00 for L2tpext@ietf.org; Tue, 18 Nov 2003 14:44:51 -0500 Received: from web41303.mail.yahoo.com ([66.218.93.52]) by ietf-mx with smtp (Exim 4.12) id 1AMBmM-00047g-00 for L2tpext@ietf.org; Tue, 18 Nov 2003 14:44:50 -0500 Message-ID: <20031118194419.9873.qmail@web41303.mail.yahoo.com> Received: from [66.17.149.13] by web41303.mail.yahoo.com via HTTP; Tue, 18 Nov 2003 11:44:19 PST Date: Tue, 18 Nov 2003 11:44:19 -0800 (PST) From: Vipin Jain To: Paul Howard Cc: l2tp ietflist , Keyur Parikh , vipin@riverstonenet.com In-Reply-To: <3FB2B20D.3000205@juniper.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Subject: [L2tpext] Re: Questions about draft-ietf-l2tpext-failover-02.txt Sender: l2tpext-admin@ietf.org Errors-To: l2tpext-admin@ietf.org X-BeenThere: l2tpext@ietf.org X-Mailman-Version: 2.0.12 Precedence: bulk List-Unsubscribe: , List-Id: Layer Two Tunneling Protocol Extensions List-Post: List-Help: List-Subscribe: , Paul, my comments inline within '<> <>' tabs. It seems that the crux of the discussion boils down to whether to use aseparate recovery tunnel for each extant tunnel (as described in the currentdraft) or to do recovery within each tunnel. 
Any solution needs to allow for the possibility of a completely hitless data plane (i.e. 0 data packet loss). Discussions so far have focused upon a basic 3 way handshake to perform the recovery of the control connection's sequence numbers regardless of whether a separate recovery tunnel is used.

<> agree, and therefore so far we agree that retaining the old tunnel id (and its control session) is probably the best way to handle it. <>

Using a separate recovery tunnel seems to present the following advantages and disadvantages:

Plus:
- Leverages extant reliable packet delivery of L2TP control connection via separate recovery tunnel

<> The mechanism does allow parallel recovery. However, as you point out there is a penalty of resource reservation. <>

Minus:
- To allow parallelism (necessary to recover in a timely fashion and avoid disconnects due to tunnel hello failures), the endpoints must be capable of supporting some number of tunnels over and above their normal operating limits. Given my implementation currently runs up to 8,000 tunnels, I'd be looking at having to allow transient peaks of 16,000 tunnels for fully parallel recovery. Using less than a 100% resource reserve and a retry mechanism as you have proposed (thus doing only a subset of the tunnels in parallel) would reduce the resource requirements, but increase the likelihood of tunnels failing due to increased latency in tunnel recovery.

<> Well, I think the situation is not that bad. For example, let me consider a few resources needed for tunnel allocation and its maintenance:
- hardware resource allocation (i.e. to enable data flow on a given tunnel)
- Software data structures for a tunnel
- Transmit buffers
- Tunnel Ids themselves

Let me comment on each one individually:
- There shouldn't be any hardware resource allocation for the new tunnel (created to restore the old tunnel) because all traffic is control traffic.
Also the new tunnel being established for recovery will not have any sessions established on it thereby eliminating the need for the hardware resource. I agree, it is implementation specific, but still it is not a big pitch.
- Transmit buffers on the new tunnel have minimal requirements. The tunnel could be torn down as soon as the old tunnel is recovered. An implementation could have all transmit buffers related to the old tunnel be freed as soon as it successfully receives the indication that the old tunnel needs recovery.
- Software data structures, etc. need to be allocated. I wonder if per tunnel memory consumption, etc. is that high. Given that there will not be any sessions established on that tunnel, we are limiting the amount of software resources needed for that as well.
- Tunnel Ids that are allocated for recovery are ephemeral and shouldn't be a big road block IMHO.

In summary, I'd say these things could be taken care of given the alternatives we face. <>

- Unless the semantics are changed to allow two tunnels with the same TIDs, the TIDs of the recovery tunnel must differ from the old tunnel. This implies re-programming of the data plane for all active sessions and, most likely, at least a brief interruption of data flow. Your proposal to reset the old tunnel's sequence numbers instead of replacing the old tunnel with the recovery tunnel would address this issue. It does, however, carry with it additional packet overhead (due to the required shutdown of the recovery tunnel as opposed to the silent abandonment of the old tunnel).

<> I think there is a trade-off: either we introduce a totally new mechanism or we use a mechanism that overlays the existing one. As you mention, if we use the existing mechanism there is some overhead of tunnels getting established and torn down. In fact we faced the same question long ago, whether to use the existing mechanism or utilize newer mechanisms, and decided to go ahead with what we have.
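
The resource reserve discussed above (8,000 tunnels peaking at 16,000 under fully parallel recovery) can be put in numbers. The following is only a rough sketch: it assumes one temporary recovery tunnel per old tunnel, and the helper names `peak_tunnels` and `recovery_batches` are hypothetical, not taken from the draft or any implementation.

```python
import math


def peak_tunnels(active: int, parallel_fraction: float) -> int:
    """Transient peak tunnel count when a fraction of the old tunnels
    is recovered at once, each needing one temporary recovery tunnel
    (an assumption of this sketch)."""
    if active < 0 or not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("active must be >= 0 and parallel_fraction in [0, 1]")
    return active + math.ceil(active * parallel_fraction)


def recovery_batches(active: int, reserve: int) -> int:
    """Number of sequential batches needed when at most `reserve`
    recovery tunnels may exist at the same time."""
    if reserve <= 0:
        raise ValueError("reserve must be positive")
    return math.ceil(active / reserve)


if __name__ == "__main__":
    print(peak_tunnels(8000, 1.0))       # fully parallel recovery
    print(peak_tunnels(8000, 0.25))      # 25% resource reserve
    print(recovery_batches(8000, 2000))  # batches needed with that reserve
```

A smaller reserve lowers the peak but multiplies the number of batches, which is where the added recovery latency, and the risk of hello timeouts, comes from.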
<>

Using the existing tunnel to recover from the failover:

Plus:
- No additional resources required to allow fully parallel recovery

<> As I mentioned earlier, if resources are managed, parallel recovery is possible in the other mechanism that was proposed earlier. <>

- No reprogramming of the data plane required (in the case of a control plane only failover) - thus no interruption of the data flow.

<> Given that we rectify the draft to accommodate the mechanism to retain old tunnel ids for the sessions, this shouldn't be a problem in what I proposed. <>

Minus:
- Reliable packet delivery mechanisms of the control connection are unavailable to do the recovery handshake (since one side has lost its knowledge of Ns and Nr). This implies that the handshake must be done in an unnumbered mode. Procedures for reliable delivery and acknowledgement would need to be provided; however, the nature of the 3 way handshake deals with most of these issues.

<> In general my worry is: introducing unnumbered packets on control plane for tunnel recovery introduces a new dimension to how the reliable layer fundamentally works. Changing that now might be a little late, given the number of implementations. More comments inline.. <>

You mentioned specific concerns about unnumbered mode in your last response.

- Acknowledgements - A possible approach is that all but the last packet of the 3 way handshake is handled by the next packet in the handshake; the last packet in the handshake delivers the last of the sequence number reset data and thus can be acknowledged by requiring a normal ZLB Ack upon receipt.

<> To Ack a given packet, it must be numbered I suppose, which means you are proposing we run a parallel train of sequence numbers on the same control plane? Isn't it like running two different tunnels over the same tunnel, will it not require fundamental changes to the reliable layer implementations?
And is it really worth running two sets of transmit and receive buffers per tunnel to be able to support failure recovery? Will it not chew up resources? <>

- Reliable transmit - It seems that retransmits of outstanding handshake frames pending acknowledgement would handle this issue.

<> Does this mean you are proposing we have multiple timers for retransmit, one for regular traffic, another one for unnumbered frames? Does it not look too kludgy? <>

- Transmit queue - The normal transmit queue of the control connection is out of commission pending re-sync of the sequence numbers. Any packets on this queue would be held pending conclusion of the resync.

<> Still, there have to be two separate (logically at least) transmit queues? <>

- Receive window constraints - The receive window is also out of commission pending re-sync of the sequence numbers thus it's not clear how it could even be applied. The 3 way handshake does in itself effectively enforce a flow control with an RWS of 1. This shouldn't be an issue since nothing else can happen on the tunnel's control connection pending resync.

<> same concern as above here <>

- Bombarded with unnumbered traffic - This could happen with any frame (numbered or unnumbered) and a non-well behaved peer (or hacker). The control connection would have to do at least some packet examination to discard (but it does for bogus numbered mode frames as well). If a more efficient discard of such frames was an issue, then an exchange of re-sync cookies as part of the initial control connection setup would allow more efficient discard of bogus frames and allow preliminary validation of a resync request (all unnumbered mode frames would be required to carry the appropriate resync cookie).

<> I agree, control packet would have to do some packet examination even for numbered traffic, but IMO it is much better to reject it looking at the header than to do the same after having parsed the AVPs. Where do you plan to place the 'cookie' field in?
If not in the header we may not achieve the objective. <>

<> IMO, if we rectify the existing mechanism to allow retaining the tunnel-id (the original issue that you had brought up - a very valid one, which I think we should take care of) then we'd achieve hitless data plane behavior as suggested by you in the beginning. I'll comment on the rest after settling on these ones :-)

thanks,
-- vipin <>

- DOS attack - I'm assuming a hacker with no ability to snoop (if the hacker can snoop, then they can just send the appropriate StopCCN and any protections on the resync mechanism or issues with numbered vs unnumbered mode are moot). Up to the point of the hacker guessing source IP, dest IP, and TID there doesn't seem to be any difference in the susceptibility of numbered vs unnumbered mode. After this point, numbered mode has the advantage of requiring the proper guess for Ns. The use of a resync cookie (or more generically an unnumbered mode cookie) in all unnumbered frames would provide an equivalent level of protection for unnumbered mode.

Other outstanding issues that I see in the e-mail thread:

- What to do about frames on the transmit queue of the control connection? I had suggested renumbering them once the new sequence numbers had been established. You suggested discarding all of the frames. I think either approach will work. I'm concerned about some of the issues arising from a decision to discard - primarily the impact on the protocol layer which can no longer submit frames and expect them to be unconditionally delivered barring a tunnel failure. Consider an established session and the non failed endpoint has just queued a CDN for transmit when a failover occurs. Both endpoints think the session is established, but with a discard of the pending transmits, the CDN never gets sent. Now the FSQ/FSR mechanism would detect this; however, the discard of the CDN may have tossed information of interest to the peer (e.g. disconnect cause from RFC 3145).
I guess I don't see the harm in re-numbering upon resync. Any frames in the transmit queue are by definition outstanding (some may not even have been sent yet due to flow control). The frames may or may not have been received at the peer before the failover (and if received may or may not have been remembered). The worst case is the frame was received and remembered at which point the protocol layer at the peer will get a second copy of the packet (the control connection duplicate elimination won't catch it) and the session will most likely get torn down as a result (since the 2nd packet would arrive while the state machine is not expecting it). Frames not remembered would continue whatever action was being attempted before the failover without harm. I suppose that if failover is slow enough, any frames in the transmit queue would be so stale as to be not worth sending. I question whether this is justification to toss the frames in the transmit queue given that failover may occur fast enough to avoid imposing any staleness on these frames. A decision to toss outstanding frames means the protocol layer must be adjusted to deal with transmit failure in the absence of tunnel failure - thus e.g. an established session may have to re-submit a CDN. In one of your comments, you brought up re-transmitting without re-numbering. I believe this choice will cause the tunnel to fail since such packets will not be acknowledged by the peer causing re-transmission of the packets and eventual tunnel failure. Having said all of the above, it's probably reasonable that this be an implementation decision (the implementation must either discard the contents of the transmit queue or re-number the frames; the implementation must not transmit without re-numbering). It may be worthwhile to have a section describing the alternatives and the potential issues with each.

Thanx,
Paul

Vipin Jain wrote:

hi Paul, my response inline..

Reserved Bits: I am in for that.
Based on the discussions earlier Mark and we agreed that if we can get it done by using a control plane message exchange there is no need to make a L2TP-header change for this.

[pwh4] I'm not sure how to interpret this comment. I agree we don't want to make changes where we can avoid them, but this needs to be balanced against providing a resource efficient solution. By creating the concept of unnumbered frames to do reliable layer signaling then we can have the reliable layer re-sync the control plane without requiring a separate tunnel to do the resync and without requiring a tid change. IMHO, this would seem to justify taking a reserved bit to indicate UI.

I don't think introducing the concept of having unnumbered messages and changing header bits is a good idea. It would bring in the following problems:
- How do you ack an unnumbered message? More importantly how do you reliably transmit them? Do they take the same transmit queue and apply with rx window constraints?
- What if a node is bombarded with such unnumbered messages? Are we supposed to interpret each one of them?

Using Ns=0, Nr=0 Scheme: This is flawed because it might naturally coincide with the sequence numbers of a tunnel thereby confusing this as an acknowledgement for a packet sent earlier.

[pwh4] I don't think so. With unnumbered frames, there is no Ns/Nr thus they are ignored. I mentioned 0/0 only because they should probably be set to something. The UI bit would be looked at by the reliable layer before it looks at Ns/Nr. UI frames would be handled entirely by the reliable layer and not the protocol layer.

Using Ns=0, Nr=0 when UI bit is set invites interpreting any message with Nr=0, Ns=0 and UI bit set - Does it invite a DoS attack?

One negative to this approach is that it would require the systems to have the resources available to at least temporarily manage the additional tunnels - this might imply being able to temporarily peak at double the normally supported number of tunnels.
If the non-failed endpoint was at its maximum number of tunnels, how would it know to make an exception for the failover tunnel setup?

And therefore we would RECOMMEND keeping space for at least one tunnel for recovery purposes. If it can't establish the tunnel, then it could retry 'x' times before concluding the tunnel recovery mechanism failed. This is much better than the current proposal where we'd establish one new tunnel for every old tunnel.

[pwh4] If we want recovery to proceed expeditiously (and we really only have a maximum of 1 hello timeout plus max retransmit timeout), then we really need to be able to recover tunnels in parallel. This could be very resource intensive. It doesn't seem likely that a large number of tunnels will necessarily be coming from the same peer. I typically see not more than a small handful (say 3) tunnels coming from the same peer to allow for different service policies. With this scenario, I'd need an additional 33% resources to recover the tunnels in parallel - that's a rather steep price to pay.

Parallel recovery was definitely one of the design goals (Appendix A.2); so in line with your thinking if we wish to recover we'd need to reserve the resources. Having a three-way dialogue (to reset one another's control plane) is a MUST. Now, doing that on the existing tunnel is what we are evaluating. My proposal is that using the existing mechanism in the draft, if we reset the old tunnel's control plane thereby keeping the data plane hitless (because old tunnel-id is intact), is something workable and fits in existing constructs of tunnel establishment, including individually authenticating peers upon restart.

How does the transition to the new sequence number occur? I presume we hand off the new sequence numbers to the reliable layer and it then purges its re-ordering rx queue, renumbers any outstanding transmits and immediately re-transmits them? Any stale receives get automatically discarded as out of window.
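
The transition asked about above (purge the re-ordering queue, then either renumber or discard the outstanding transmits, and drop stale receives as out of window) could be sketched roughly as follows. This is an illustration only: `ControlChannel`, `resync` and `in_window` are hypothetical names, the window check is simplified, and the only protocol fact relied on is that L2TP Ns/Nr are 16-bit counters.

```python
from collections import deque

SEQ_MOD = 1 << 16  # L2TP Ns/Nr are 16-bit and wrap around


class ControlChannel:
    """Toy model of one tunnel's reliable layer (hypothetical sketch)."""

    def __init__(self, ns: int, nr: int, rws: int = 4):
        self.ns = ns              # next Ns to assign to a new frame
        self.nr = nr              # next Ns expected from the peer
        self.rws = rws            # receive window size
        self.tx_queue = deque()   # unacknowledged (ns, payload) pairs
        self.rx_reorder = {}      # out-of-order receives keyed by Ns

    def submit(self, payload):
        """Queue a frame for reliable delivery under the current Ns."""
        self.tx_queue.append((self.ns, payload))
        self.ns = (self.ns + 1) % SEQ_MOD

    def in_window(self, ns: int) -> bool:
        """True if a received Ns falls inside the receive window."""
        return (ns - self.nr) % SEQ_MOD < self.rws

    def resync(self, new_ns: int, new_nr: int, renumber: bool = True):
        """Switch to the new sequence space.

        renumber=True  re-queues pending frames under new numbers.
        renumber=False discards them and returns the payloads so the
        protocol layer can learn what was never delivered.
        Frames are never left queued under their old numbers.
        """
        self.rx_reorder.clear()
        pending = [payload for _, payload in self.tx_queue]
        self.tx_queue.clear()
        self.ns, self.nr = new_ns, new_nr
        if renumber:
            for payload in pending:
                self.submit(payload)
            return []
        return pending


if __name__ == "__main__":
    ch = ControlChannel(ns=100, nr=200)
    ch.submit("CDN")
    ch.resync(new_ns=4435, new_nr=5665)
    print(list(ch.tx_queue), ch.in_window(200))
```

Returning the discarded payloads in the discard case is one way to give the protocol layer the "transmit failed without tunnel failure" indication that the discard option requires.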
- Renumbering outstanding transmits might create more unpredictability for no benefit. Because control plane on the failed node would have lost the context related to previous messages, it is better to flush everything off and start with new sequence numbers.

[pwh4] I disagree. Say the outstanding transmit is an ICCN. The failed node may or may not have remembered sending the ICRP. By re-sending the ICCN, we allow the failed node at least the option of continuing the setup.

The target of the draft was to recover only the sessions that were in established state. This means if an endpoint does not keep track of a session's intermediate state (i.e. ICRP sent, or awaiting ICCN) then it doesn't matter if the other sends an ICCN or not, it would be discarded upon control plane restart for the tunnel.

The state machines at the protocol layer should already be capable of handling an unexpected ICCN (or for that matter any unexpected packet). If we were to throw away these packets, then we're potentially requiring additional complication at the protocol layer.

Agree; State Machine should be able to handle any packet in the life of a session or a tunnel. Regarding additional complication:
- I think section 2.3.4 addresses the inconsistency among session states.
- Renumbering the existing messages in the transmit queue is not going to eliminate the conditions that could result in the situations described in 2.3.4, so that needs to be there anyways. Then why bother resending these messages?
- If we are recovering only the sessions that were in established state then there is no need to retransmit messages for situations that could be handled otherwise by defined mechanisms.

I don't know how your protocol layer is implemented, but mine treats the reliable layer as a pipe. What I put into it is guaranteed to be delivered or I get a tunnel failed indication.
It doesn't seem that I'd really want a tunnel failed indication here, but that would be my only choice if the reliable layer threw away a packet that had been submitted for transmit.

From what I have seen, the reliable layer typically is like a pipe which will either reliably deliver a packet or provide a tunnel failure indication. Therefore it doesn't make a difference if we renumber them or not from a delivery perspective. Once queued they'll be delivered. But my point was - why even re-mark their sequence numbers?

The old tunnel becomes active only upon getting confirmation from its peer that it has reset control plane sequence numbers. So this would have to be a three way handshake as described below. For example, for an old tunnel:
- Failed node sends: "Reset Nr to 5665 for Old Local Tid = 9, Old Tid = 6".
- Non Failed node responds: "Nr for Old Tid = 9 reset to 5665, Reset Nr to 4435 for the same tunnel (i.e. Old Local Tid = 6, Old Tid = 9 from this node's perspective)". Failed endpoint upon getting this message must first enqueue the response of this message and then start sending control messages on the Old tunnel.
- Failed node sends: "Nr for Old Tid = 9 reset to 4435". Non failed node upon getting this can start sending control messages.

[pwh4] I presume in the case where control messages with the new sequence number space start arriving at the non-failed node before the final resync message, the messages get discarded as OOW. The failed node will then re-transmit and they'll be accepted once the non-failed node gets the final resync message.

What we discuss above could work. However, to make it simpler I think simply resetting the control plane of old tunnel would be good enough.

thanks,
-- vipin

__________________________________
Do you Yahoo!?
Protect your identity with Yahoo!
Mail AddressGuard http://antispam.yahoo.com/whatsnewfree _______________________________________________ L2tpext mailing list L2tpext@ietf.org https://www1.ietf.org/mailman/listinfo/l2tpext From exim@www1.ietf.org Tue Nov 18 14:45:34 2003 Received: from optimus.ietf.org ([132.151.1.19]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA09623 for ; Tue, 18 Nov 2003 14:45:33 -0500 (EST) Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AMBmj-0002U8-QG for l2tpext-archive@odin.ietf.org; Tue, 18 Nov 2003 14:45:15 -0500 Received: (from exim@localhost) by www1.ietf.org (8.12.8/8.12.8/Submit) id hAIJjD5m009541 for l2tpext-archive@odin.ietf.org; Tue, 18 Nov 2003 14:45:13 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AMBmj-0002Ti-Jo for l2tpext-web-archive@optimus.ietf.org; Tue, 18 Nov 2003 14:45:13 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA09549 for ; Tue, 18 Nov 2003 14:45:00 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AMBmg-00048o-00 for l2tpext-web-archive@ietf.org; Tue, 18 Nov 2003 14:45:10 -0500 Received: from [132.151.1.19] (helo=optimus.ietf.org) by ietf-mx with esmtp (Exim 4.12) id 1AMBmf-00048g-00 for l2tpext-web-archive@ietf.org; Tue, 18 Nov 2003 14:45:09 -0500 Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AMBmY-0002Rr-2x; Tue, 18 Nov 2003 14:45:02 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AMBm7-0002Qy-DF for l2tpext@optimus.ietf.org; Tue, 18 Nov 2003 14:44:35 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA09505 for ; Tue, 18 Nov 2003 14:44:17 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) 
id 1AMBlz-00047t-00 for L2tpext@ietf.org; Tue, 18 Nov 2003 14:44:27 -0500 Received: from web41310.mail.yahoo.com ([66.218.93.59]) by ietf-mx with smtp (Exim 4.12) id 1AMBly-00047K-00 for L2tpext@ietf.org; Tue, 18 Nov 2003 14:44:26 -0500 Message-ID: <20031118194356.41401.qmail@web41310.mail.yahoo.com> Received: from [66.17.149.13] by web41310.mail.yahoo.com via HTTP; Tue, 18 Nov 2003 11:43:56 PST Date: Tue, 18 Nov 2003 11:43:56 -0800 (PST) From: Vipin Jain To: Paul Howard Cc: l2tp ietflist , Keyur Parikh , vipin@riverstonenet.com In-Reply-To: <3FB2B20D.3000205@juniper.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Subject: [L2tpext] Re: Questions about draft-ietf-l2tpext-failover-02.txt Sender: l2tpext-admin@ietf.org Errors-To: l2tpext-admin@ietf.org X-BeenThere: l2tpext@ietf.org X-Mailman-Version: 2.0.12 Precedence: bulk List-Unsubscribe: , List-Id: Layer Two Tunneling Protocol Extensions List-Post: List-Help: List-Subscribe: ,

Paul, my comments inline within '<> <>' tags.

It seems that the crux of the discussion boils down to whether to use a separate recovery tunnel for each extant tunnel (as described in the current draft) or to do recovery within each tunnel. Any solution needs to allow for the possibility of a completely hitless data plane (i.e. 0 data packet loss). Discussions so far have focused upon a basic 3 way handshake to perform the recovery of the control connection's sequence numbers regardless of whether a separate recovery tunnel is used.

<> agree, and therefore so far we agree that retaining the old tunnel id (and its control session) is probably the best way to handle it. <>

Using a separate recovery tunnel seems to present the following advantages and disadvantages:

Plus:
- Leverages extant reliable packet delivery of L2TP control connection via separate recovery tunnel

<> The mechanism does allow parallel recovery. However, as you point out there is a penalty of resource reservation.
<>

Minus:
- To allow parallelism (necessary to recover in a timely fashion and avoid disconnects due to tunnel hello failures), the endpoints must be capable of supporting some number of tunnels over and above their normal operating limits. Given my implementation currently runs up to 8,000 tunnels, I'd be looking at having to allow transient peaks of 16,000 tunnels for fully parallel recovery. Using less than a 100% resource reserve and a retry mechanism as you have proposed (thus doing only a subset of the tunnels in parallel) would reduce the resource requirements, but increase the likelihood of tunnels failing due to increased latency in tunnel recovery.

<> Well, I think the situation is not that bad. For example, let me consider a few resources needed for tunnel allocation and its maintenance:
- hardware resource allocation (i.e. to enable data flow on a given tunnel)
- Software data structures for a tunnel
- Transmit buffers
- Tunnel Ids themselves

Let me comment on each one individually:
- There shouldn't be any hardware resource allocation for the new tunnel (created to restore the old tunnel) because all traffic is control traffic. Also the new tunnel being established for recovery will not have any sessions established on it thereby eliminating the need for the hardware resource. I agree, it is implementation specific, but still it is not a big pitch.
- Transmit buffers on the new tunnel have minimal requirements. The tunnel could be torn down as soon as the old tunnel is recovered. An implementation could have all transmit buffers related to the old tunnel be freed as soon as it successfully receives the indication that the old tunnel needs recovery.
- Software data structures, etc. need to be allocated. I wonder if per tunnel memory consumption, etc. is that high. Given that there will not be any sessions established on that tunnel, we are limiting the amount of software resources needed for that as well.
- Tunnel Ids that are allocated for recovery are ephemeral and shouldn't be a big road block IMHO.

In summary, I'd say these things could be taken care of given the alternatives we face. <>

- Unless the semantics are changed to allow two tunnels with the same TIDs, the TIDs of the recovery tunnel must differ from the old tunnel. This implies re-programming of the data plane for all active sessions and, most likely, at least a brief interruption of data flow. Your proposal to reset the old tunnel's sequence numbers instead of replacing the old tunnel with the recovery tunnel would address this issue. It does, however, carry with it additional packet overhead (due to the required shutdown of the recovery tunnel as opposed to the silent abandonment of the old tunnel).

<> I think there is a trade-off: either we introduce a totally new mechanism or we use a mechanism that overlays the existing one. As you mention, if we use the existing mechanism there is some overhead of tunnels getting established and torn down. In fact we faced the same question long ago, whether to use the existing mechanism or utilize newer mechanisms, and decided to go ahead with what we have. <>

Using the existing tunnel to recover from the failover:

Plus:
- No additional resources required to allow fully parallel recovery

<> As I mentioned earlier, if resources are managed, parallel recovery is possible in the other mechanism that was proposed earlier. <>

- No reprogramming of the data plane required (in the case of a control plane only failover) - thus no interruption of the data flow.

<> Given that we rectify the draft to accommodate the mechanism to retain old tunnel ids for the sessions, this shouldn't be a problem in what I proposed. <>

Minus:
- Reliable packet delivery mechanisms of the control connection are unavailable to do the recovery handshake (since one side has lost its knowledge of Ns and Nr). This implies that the handshake must be done in an unnumbered mode.
Procedures for reliable delivery and acknowledgement would need to be provided; however, the nature of the 3 way handshake deals with most of these issues.

<> In general my worry is: introducing unnumbered packets on control plane for tunnel recovery introduces a new dimension to how the reliable layer fundamentally works. Changing that now might be a little late, given the number of implementations. More comments inline.. <>

You mentioned specific concerns about unnumbered mode in your last response.

- Acknowledgements - A possible approach is that all but the last packet of the 3 way handshake is handled by the next packet in the handshake; the last packet in the handshake delivers the last of the sequence number reset data and thus can be acknowledged by requiring a normal ZLB Ack upon receipt.

<> To Ack a given packet, it must be numbered I suppose, which means you are proposing we run a parallel train of sequence numbers on the same control plane? Isn't it like running two different tunnels over the same tunnel, will it not require fundamental changes to the reliable layer implementations? And is it really worth running two sets of transmit and receive buffers per tunnel to be able to support failure recovery? Will it not chew up resources? <>

- Reliable transmit - It seems that retransmits of outstanding handshake frames pending acknowledgement would handle this issue.

<> Does this mean you are proposing we have multiple timers for retransmit, one for regular traffic, another one for unnumbered frames? Does it not look too kludgy? <>

- Transmit queue - The normal transmit queue of the control connection is out of commission pending re-sync of the sequence numbers. Any packets on this queue would be held pending conclusion of the resync.

<> Still, there have to be two separate (logically at least) transmit queues?
<>

- Receive window constraints - The receive window is also out of commission pending re-sync of the sequence numbers thus it's not clear how it could even be applied. The 3 way handshake does in itself effectively enforce a flow control with an RWS of 1. This shouldn't be an issue since nothing else can happen on the tunnel's control connection pending resync.

<> same concern as above here <>

- Bombarded with unnumbered traffic - This could happen with any frame (numbered or unnumbered) and a non-well behaved peer (or hacker). The control connection would have to do at least some packet examination to discard (but it does for bogus numbered mode frames as well). If a more efficient discard of such frames was an issue, then an exchange of re-sync cookies as part of the initial control connection setup would allow more efficient discard of bogus frames and allow preliminary validation of a resync request (all unnumbered mode frames would be required to carry the appropriate resync cookie).

<> I agree, control packet would have to do some packet examination even for numbered traffic, but IMO it is much better to reject it looking at the header than to do the same after having parsed the AVPs. Where do you plan to place the 'cookie' field in? If not in the header we may not achieve the objective. <>

<> IMO, if we rectify the existing mechanism to allow retaining the tunnel-id (the original issue that you had brought up - a very valid one, which I think we should take care of) then we'd achieve hitless data plane behavior as suggested by you in the beginning. I'll comment on the rest after settling on these ones :-)

thanks,
-- vipin <>

- DOS attack - I'm assuming a hacker with no ability to snoop (if the hacker can snoop, then they can just send the appropriate StopCCN and any protections on the resync mechanism or issues with numbered vs unnumbered mode are moot).
Up to the point of the hacker guessing source IP, dest IP, and TID there doesn't seem to be any difference in the susceptibility of numbered vs unnumbered mode. After this point, numbered mode has the advantage of requiring the proper guess for Ns. The use of a resync cookie (or more generically an unnumbered mode cookie) in all unnumbered frames would provide an equivalent level of protection for unnumbered mode.

Other outstanding issues that I see in the e-mail thread:

- What to do about frames on the transmit queue of the control connection? I had suggested renumbering them once the new sequence numbers had been established. You suggested discarding all of the frames. I think either approach will work. I'm concerned about some of the issues arising from a decision to discard - primarily the impact on the protocol layer which can no longer submit frames and expect them to be unconditionally delivered barring a tunnel failure. Consider an established session and the non failed endpoint has just queued a CDN for transmit when a failover occurs. Both endpoints think the session is established, but with a discard of the pending transmits, the CDN never gets sent. Now the FSQ/FSR mechanism would detect this; however, the discard of the CDN may have tossed information of interest to the peer (e.g. disconnect cause from RFC 3145). I guess I don't see the harm in re-numbering upon resync. Any frames in the transmit queue are by definition outstanding (some may not even have been sent yet due to flow control). The frames may or may not have been received at the peer before the failover (and if received may or may not have been remembered). The worst case is the frame was received and remembered at which point the protocol layer at the peer will get a second copy of the packet (the control connection duplicate elimination won't catch it) and the session will most likely get torn down as a result (since the 2nd packet would arrive while the state machine is not expecting it).
Frames not remembered would continue whatever action was being attempted before the failover without harm. I suppose that if failover is slow enough, any frames in the transmit queue would be so stale as to be not worth sending. I question whether this is justification to toss the frames in the transmit queue given that failover may occur fast enough to avoid imposing any staleness on these frames. A decision to toss outstanding frames means the protocol layer must be adjusted to deal with transmit failure in the absence of tunnel failure - thus e.g. an established session may have to re-submit a CDN. In one of your comments, you brought up re-transmitting without re-numbering. I believe this choice will cause the tunnel to fail since such packets will not be acknowledged by the peer, causing re-transmission of the packets and eventual tunnel failure. Having said all of the above, it's probably reasonable that this be an implementation decision (the implementation must either discard the contents of the transmit queue or re-number the frames; the implementation must not transmit without re-numbering). It may be worthwhile to have a section describing the alternatives and the potential issues with each.

Thanx, Paul

Vipin Jain wrote:

hi Paul, my response inline..

Reserved Bits: I am in for that. Based on the discussions earlier, Mark and we agreed that if we can get it done by using a control plane message exchange there is no need to make a L2TP-header change for this.

[pwh4] I'm not sure how to interpret this comment. I agree we don't want to make changes where we can avoid them, but this needs to be balanced against providing a resource efficient solution. By creating the concept of unnumbered frames to do reliable layer signaling, we can have the reliable layer re-sync the control plane without requiring a separate tunnel to do the resync and without requiring a tid change. IMHO, this would seem to justify taking a reserved bit to indicate UI.
I don't think introducing the concept of having unnumbered messages and changing header bits is a good idea. It would bring in the following problems:
- How do you ack an unnumbered message? More importantly, how do you reliably transmit them? Do they take the same transmit queue and comply with rx window constraints?
- What if a node is bombarded with such unnumbered messages? Are we supposed to interpret each one of them?

Using Ns=0, Nr=0 Scheme: This is flawed because it might naturally coincide with the sequence numbers of a tunnel, thereby confusing this as an acknowledgement for a packet sent earlier.

[pwh4] I don't think so. With unnumbered frames, there is no Ns/Nr, thus they are ignored. I mentioned 0/0 only because they should probably be set to something. The UI bit would be looked at by the reliable layer before it looks at Ns/Nr. UI frames would be handled entirely by the reliable layer and not the protocol layer.

Using Ns=0, Nr=0 when UI bit is set invites interpreting any message with Nr=0, Ns=0 and UI bit set - does it invite a DoS attack?

One negative to this approach is that it would require the systems to have the resources available to at least temporarily manage the additional tunnels - this might imply being able to temporarily peak at double the normally supported number of tunnels. If the non-failed endpoint was at its maximum number of tunnels, how would it know to make an exception for the failover tunnel setup?

And therefore we would RECOMMEND keeping space for at least one tunnel for recovery purposes. If it can't establish the tunnel, then it could retry 'x' times before concluding the tunnel recovery mechanism failed. This is much better than the current proposal where we'd establish one new tunnel for every old tunnel.

[pwh4] If we want recovery to proceed expeditiously (and we really only have a maximum of 1 hello timeout plus max retransmit timeout), then we really need to be able to recover tunnels in parallel. This could be very resource intensive.
It doesn't seem likely that a large number of tunnels will necessarily be coming from the same peer. I typically see not more than a small handful (say 3) tunnels coming from the same peer to allow for different service policies. With this scenario, I'd need an additional 33% resources to recover the tunnels in parallel - that's a rather steep price to pay.

Parallel recovery was definitely one of the design goals (Appendix A.2); so in line with your thinking, if we wish to recover we'd need to reserve the resources. Having a three-way dialogue (to reset one another's control plane) is a MUST. Now, doing that on the existing tunnel is what we are evaluating. My proposal is that using the existing mechanism in the draft, if we reset the old tunnel's control plane, thereby keeping the data plane hitless (because old tunnel-id is intact), is something workable and fits in existing constructs of tunnel establishment, including individually authenticating peers upon restart.

How does the transition to the new sequence number occur? I presume we hand off the new sequence numbers to the reliable layer and it then purges its re-ordering rx queue, renumbers any outstanding transmits and immediately re-transmits them? Any stale receives get automatically discarded as out of window.

- Renumbering outstanding transmits might create more unpredictability for no benefit. Because the control plane on the failed node would have lost the context related to previous messages, it is better to flush everything off and start with new sequence numbers.

[pwh4] I disagree. Say the outstanding transmit is an ICCN. The failed node may or may not have remembered sending the ICRP. By re-sending the ICCN, we allow the failed node at least the option of continuing the setup.

The target of the draft was to recover only the sessions that were in established state. This means if an endpoint does not keep track of a session's intermediate state (i.e.
ICRP sent, or awaiting ICCN) then it doesn't matter if the other sends an ICCN or not, it would be discarded upon control plane restart for the tunnel.

The state machines at the protocol layer should already be capable of handling an unexpected ICCN (or for that matter any unexpected packet). If we were to throw away these packets, then we're potentially requiring additional complication at the protocol layer.

Agree; the State Machine should be able to handle any packet in the life of a session or a tunnel. Regarding additional complication:
- I think section 2.3.4 addresses the inconsistency among session states.
- Renumbering the existing messages in the transmit queue is not going to eliminate the conditions that could result in the situations described in 2.3.4, so that needs to be there anyway. Then why bother resending these messages?
- If we are recovering only the sessions that were in established state then there is no need to retransmit messages for situations that could be handled otherwise by defined mechanisms.

I don't know how your protocol layer is implemented, but mine treats the reliable layer as a pipe. What I put into it is guaranteed to be delivered or I get a tunnel failed indication. It doesn't seem that I'd really want a tunnel failed indication here, but that would be my only choice if the reliable layer threw away a packet that had been submitted for transmit.

From what I have seen, the reliable layer typically is like a pipe which will either reliably deliver a packet or provide a tunnel failure indication. Therefore it doesn't make a difference if we renumber them or not from a delivery perspective. Once queued they'll be delivered. But my point was - why even re-mark their sequence numbers?

The old tunnel becomes active only upon getting confirmation from its peer that it has reset control plane sequence numbers. So this would have to be a three way handshake as described below.
For example, for an old tunnel:
- Failed node sends: "Reset Nr to 5665 for Old Local Tid = 9, Old Tid = 6".
- Non Failed node responds: "Nr for Old Tid = 9 reset to 5665, Reset Nr to 4435 for the same tunnel (i.e. Old Local Tid = 6, Old Tid = 9 from this node's perspective)". Failed endpoint upon getting this message must first enqueue the response to this message and then start sending control messages on the Old tunnel.
- Failed node sends: "Nr for Old Tid = 9 reset to 4435". Non failed node upon getting this can start sending control messages.

[pwh4] I presume in the case where control messages with the new sequence number space start arriving at the non-failed node before the final resync message, the messages get discarded as OOW. The failed node will then re-transmit and they'll be accepted once the non-failed node gets the final resync message.

What we discuss above could work. However, to make it simpler I think simply resetting the control plane of old tunnel would be good enough.

thanks, -- vipin

__________________________________ Do you Yahoo!? Protect your identity with Yahoo! 
Mail AddressGuard http://antispam.yahoo.com/whatsnewfree _______________________________________________ L2tpext mailing list L2tpext@ietf.org https://www1.ietf.org/mailman/listinfo/l2tpext From l2tpext-admin@ietf.org Tue Nov 18 15:32:49 2003 Received: from optimus.ietf.org ([132.151.1.19]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA09591 for ; Tue, 18 Nov 2003 14:45:29 -0500 (EST) Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AMBmY-0002Rr-2x; Tue, 18 Nov 2003 14:45:02 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AMBm7-0002Qy-DF for l2tpext@optimus.ietf.org; Tue, 18 Nov 2003 14:44:35 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA09505 for ; Tue, 18 Nov 2003 14:44:17 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AMBlz-00047t-00 for L2tpext@ietf.org; Tue, 18 Nov 2003 14:44:27 -0500 Received: from web41310.mail.yahoo.com ([66.218.93.59]) by ietf-mx with smtp (Exim 4.12) id 1AMBly-00047K-00 for L2tpext@ietf.org; Tue, 18 Nov 2003 14:44:26 -0500 Message-ID: <20031118194356.41401.qmail@web41310.mail.yahoo.com> Received: from [66.17.149.13] by web41310.mail.yahoo.com via HTTP; Tue, 18 Nov 2003 11:43:56 PST Date: Tue, 18 Nov 2003 11:43:56 -0800 (PST) From: Vipin Jain To: Paul Howard Cc: l2tp ietflist , Keyur Parikh , vipin@riverstonenet.com In-Reply-To: <3FB2B20D.3000205@juniper.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Subject: [L2tpext] Re: Questions about draft-ietf-l2tpext-failover-02.txt Sender: l2tpext-admin@ietf.org Errors-To: l2tpext-admin@ietf.org X-BeenThere: l2tpext@ietf.org X-Mailman-Version: 2.0.12 Precedence: bulk List-Unsubscribe: , List-Id: Layer Two Tunneling Protocol Extensions List-Post: List-Help: List-Subscribe: , Paul, my comments inline within '<> <>' tabs. 
It seems that the crux of the discussion boils down to whether to use a separate recovery tunnel for each extant tunnel (as described in the current draft) or to do recovery within each tunnel. Any solution needs to allow for the possibility of a completely hitless data plane (i.e. 0 data packet loss). Discussions so far have focused upon a basic 3 way handshake to perform the recovery of the control connection's sequence numbers regardless of whether a separate recovery tunnel is used.

<> Agreed, and therefore so far we agree that retaining the old tunnel id (and its control session) is probably the best way to handle it. <>

Using a separate recovery tunnel seems to present the following advantages and disadvantages:

Plus:
- Leverages extant reliable packet delivery of L2TP control connection via separate recovery tunnel

<> The mechanism does allow parallel recovery. However, as you point out, there is a penalty of resource reservation. <>

Minus:
- To allow parallelism (necessary to recover in a timely fashion and avoid disconnects due to tunnel hello failures), the endpoints must be capable of supporting some number of tunnels over and above their normal operating limits. Given my implementation currently runs up to 8,000 tunnels, I'd be looking at having to allow transient peaks of 16,000 tunnels for fully parallel recovery. Using less than a 100% resource reserve and a retry mechanism as you have proposed (thus doing only a subset of the tunnels in parallel) would reduce the resource requirements, but increase the likelihood of tunnels failing due to increased latency in tunnel recovery.

<> Well, I think the situation is not that bad. For example, let me consider a few resources needed for tunnel allocation and its maintenance:
- hardware resource allocation (i.e.
to enable data flow on a given tunnel)
- Software data structures for a tunnel
- Transmit buffers
- Tunnel Ids themselves

Let me comment on each one individually:
- There shouldn't be any hardware resource allocation for the new tunnel (created to restore the old tunnel) because all traffic is control traffic. Also, the new tunnel being established for recovery will not have any sessions established on it, thereby eliminating the need for the hardware resource. I agree it is implementation specific, but still it is not a big issue.
- Transmit buffers on the new tunnel have minimal requirements. The tunnel could be torn down as soon as the old tunnel is recovered. An implementation could have all transmit buffers related to the old tunnel be freed as soon as it successfully receives the indication that the old tunnel needs recovery.
- Software data structures, etc. need to be allocated. I wonder if per-tunnel memory consumption, etc. is that high. Given that there will not be any sessions established on that tunnel, we are limiting the amount of software resources needed for that as well.
- Tunnel Ids that are allocated for recovery are ephemeral and shouldn't be a big road block IMHO.

In summary, I'd say these things could be taken care of given the alternatives we face. <>

- Unless the semantics are changed to allow two tunnels with the same TIDs, the TIDs of the recovery tunnel must differ from the old tunnel. This implies re-programming of the data plane for all active sessions and, most likely, at least a brief interruption of data flow. Your proposal to reset the old tunnel's sequence numbers instead of replacing the old tunnel with the recovery tunnel would address this issue. It does, however, carry with it additional packet overhead (due to the required shutdown of the recovery tunnel as opposed to the silent abandonment of the old tunnel).
<> I think there is a trade-off: either we can introduce a totally new mechanism or use a mechanism that overlays the existing mechanism. As you mention, if we use the existing mechanism there is some overhead of tunnels getting established and torn down. In fact we faced the same question long ago, whether to use the existing mechanism or utilize newer mechanisms, and decided to go ahead with what we have. <>

Using the existing tunnel to recover from the failover:

Plus:
- No additional resources required to allow fully parallel recovery

<> As I mentioned earlier, if resources are managed, parallel recovery is possible in the other mechanism that was proposed earlier. <>

- No reprogramming of the data plane required (in the case of a control plane only failover) - thus no interruption of the data flow.

<> Given that we rectify the draft to accommodate the mechanism to retain old tunnel ids for the sessions, this shouldn't be a problem in what I proposed. <>

Minus:
- Reliable packet delivery mechanisms of the control connection are unavailable to do the recovery handshake (since one side has lost its knowledge of Ns and Nr). This implies that the handshake must be done in an unnumbered mode. Procedures for reliable delivery and acknowledgement would need to be provided; however, the nature of the 3 way handshake deals with most of these issues.

<> In general my worry is: introducing unnumbered packets on the control plane for tunnel recovery introduces a new dimension to how the reliable layer fundamentally works. Changing that now might be a little late, given the number of implementations. More comments inline. <>

You mentioned specific concerns about unnumbered mode in your last response.
- Acknowledgements - A possible approach is that all but the last packet of the 3 way handshake are handled by the next packet in the handshake; the last packet in the handshake delivers the last of the sequence number reset data and thus can be acknowledged by requiring a normal ZLB Ack upon receipt.

<> To Ack a given packet, it must be numbered I suppose, which means you are proposing we run a parallel train of sequence numbers on the same control plane? Isn't it like running two different tunnels over the same tunnel, and will it not require fundamental changes to the reliable layer implementations? And is it really worth running two sets of transmit and receive buffers per tunnel to be able to support failure recovery? Will it not chew up resources? <>

- Reliable transmit - It seems that retransmits of outstanding handshake frames pending acknowledgement would handle this issue.

<> Does this mean you are proposing we have multiple timers for retransmit, one for regular traffic, another one for unnumbered frames? Does it not look too kludgy? <>

- Transmit queue - The normal transmit queue of the control connection is out of commission pending re-sync of the sequence numbers. Any packets on this queue would be held pending conclusion of the resync.

<> Still, there have to be two separate (logically at least) transmit queues? <>

- Receive window constraints - The receive window is also out of commission pending re-sync of the sequence numbers, thus it's not clear how it could even be applied. The 3 way handshake does in itself effectively enforce a flow control with an RWS of 1. This shouldn't be an issue since nothing else can happen on the tunnel's control connection pending resync.

<> Same concern as above here. <>

- Bombarded with unnumbered traffic - This could happen with any frame (numbered or unnumbered) and a non-well behaved peer (or hacker).
The control connection would have to do at least some packet examination to discard (but it does for bogus numbered mode frames as well). If a more efficient discard of such frames was an issue, then an exchange of re-sync cookies as part of the initial control connection setup would allow more efficient discard of bogus frames and allow preliminary validation of a resync request (all unnumbered mode frames would be required to carry the appropriate resync cookie).

<> I agree, the control connection would have to do some packet examination even for numbered traffic, but IMO it is much better to reject it looking at the header than to do the same after having parsed the AVPs. Where do you plan to place the 'cookie' field? If not in the header we may not achieve the objective. <>

<> IMO, if we rectify the existing mechanism to allow retaining the tunnel-id (the original issue that you had brought up - a very valid one, which I think we should take care of) then we'd achieve hitless data plane behavior as suggested by you in the beginning. I'll comment on the rest after settling on these ones :-) thanks, -- vipin <>

- DOS attack - I'm assuming a hacker with no ability to snoop (if the hacker can snoop, then they can just send the appropriate StopCCN and any protections on the resync mechanism or issues with numbered vs unnumbered mode are moot). Up to the point of the hacker guessing source IP, dest IP, and TID there doesn't seem to be any difference in the susceptibility of numbered vs unnumbered mode. After this point, numbered mode has the advantage of requiring the proper guess for Ns. The use of a resync cookie (or more generically an unnumbered mode cookie) in all unnumbered frames would provide an equivalent level of protection for unnumbered mode.

Other outstanding issues that I see in the e-mail thread:

- What to do about frames on the transmit queue of the control connection?
I had suggested renumbering them once the new sequence numbers had been established. You suggested discarding all of the frames. I think either approach will work. I'm concerned about some of the issues arising from a decision to discard - primarily the impact on the protocol layer, which can no longer submit frames and expect them to be unconditionally delivered barring a tunnel failure. Consider an established session where the non-failed endpoint has just queued a CDN for transmit when a failover occurs. Both endpoints think the session is established, but with a discard of the pending transmits, the CDN never gets sent. Now the FSQ/FSR mechanism would detect this; however, the discard of the CDN may have tossed information of interest to the peer (e.g. disconnect cause from RFC 3145). I guess I don't see the harm in re-numbering upon resync. Any frames in the transmit queue are by definition outstanding (some may not even have been sent yet due to flow control). The frames may or may not have been received at the peer before the failover (and if received may or may not have been remembered). The worst case is the frame was received and remembered, at which point the protocol layer at the peer will get a second copy of the packet (the control connection duplicate elimination won't catch it) and the session will most likely get torn down as a result (since the 2nd packet would arrive while the state machine is not expecting it). Frames not remembered would continue whatever action was being attempted before the failover without harm. I suppose that if failover is slow enough, any frames in the transmit queue would be so stale as to be not worth sending. I question whether this is justification to toss the frames in the transmit queue given that failover may occur fast enough to avoid imposing any staleness on these frames. A decision to toss outstanding frames means the protocol layer must be adjusted to deal with transmit failure in the absence of tunnel failure - thus e.g.
an established session may have to re-submit a CDN. In one of your comments, you brought up re-transmitting without re-numbering. I believe this choice will cause the tunnel to fail since such packets will not be acknowledged by the peer, causing re-transmission of the packets and eventual tunnel failure. Having said all of the above, it's probably reasonable that this be an implementation decision (the implementation must either discard the contents of the transmit queue or re-number the frames; the implementation must not transmit without re-numbering). It may be worthwhile to have a section describing the alternatives and the potential issues with each.

Thanx, Paul

Vipin Jain wrote:

hi Paul, my response inline..

Reserved Bits: I am in for that. Based on the discussions earlier, Mark and we agreed that if we can get it done by using a control plane message exchange there is no need to make a L2TP-header change for this.

[pwh4] I'm not sure how to interpret this comment. I agree we don't want to make changes where we can avoid them, but this needs to be balanced against providing a resource efficient solution. By creating the concept of unnumbered frames to do reliable layer signaling, we can have the reliable layer re-sync the control plane without requiring a separate tunnel to do the resync and without requiring a tid change. IMHO, this would seem to justify taking a reserved bit to indicate UI.

I don't think introducing the concept of having unnumbered messages and changing header bits is a good idea. It would bring in the following problems:
- How do you ack an unnumbered message? More importantly, how do you reliably transmit them? Do they take the same transmit queue and comply with rx window constraints?
- What if a node is bombarded with such unnumbered messages? Are we supposed to interpret each one of them?
Using Ns=0, Nr=0 Scheme: This is flawed because it might naturally coincide with the sequence numbers of a tunnel, thereby confusing this as an acknowledgement for a packet sent earlier.

[pwh4] I don't think so. With unnumbered frames, there is no Ns/Nr, thus they are ignored. I mentioned 0/0 only because they should probably be set to something. The UI bit would be looked at by the reliable layer before it looks at Ns/Nr. UI frames would be handled entirely by the reliable layer and not the protocol layer.

Using Ns=0, Nr=0 when UI bit is set invites interpreting any message with Nr=0, Ns=0 and UI bit set - does it invite a DoS attack?

One negative to this approach is that it would require the systems to have the resources available to at least temporarily manage the additional tunnels - this might imply being able to temporarily peak at double the normally supported number of tunnels. If the non-failed endpoint was at its maximum number of tunnels, how would it know to make an exception for the failover tunnel setup?

And therefore we would RECOMMEND keeping space for at least one tunnel for recovery purposes. If it can't establish the tunnel, then it could retry 'x' times before concluding the tunnel recovery mechanism failed. This is much better than the current proposal where we'd establish one new tunnel for every old tunnel.

[pwh4] If we want recovery to proceed expeditiously (and we really only have a maximum of 1 hello timeout plus max retransmit timeout), then we really need to be able to recover tunnels in parallel. This could be very resource intensive. It doesn't seem likely that a large number of tunnels will necessarily be coming from the same peer. I typically see not more than a small handful (say 3) tunnels coming from the same peer to allow for different service policies. With this scenario, I'd need an additional 33% resources to recover the tunnels in parallel - that's a rather steep price to pay.
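
[Editorial illustration, not part of the original message] The reserve figures quoted in this thread (8,000 tunnels peaking at 16,000, and "an additional 33%" with about 3 tunnels per peer) can be reproduced with a small calculation. This is only a sketch under one reading of the discussion: that each peer needs a fixed number of spare recovery tunnels and that all peers recover at the same time. The function names are hypothetical and appear in neither the draft nor the thread.

```python
import math


def peak_tunnels(normal_tunnels: int, tunnels_per_peer: int,
                 recovery_tunnels_per_peer: int) -> int:
    """Worst-case tunnel count while every peer recovers in parallel.

    Assumption (not from the draft): tunnels are spread evenly across
    peers, and each peer needs `recovery_tunnels_per_peer` spare tunnels.
    """
    peers = math.ceil(normal_tunnels / tunnels_per_peer)
    return normal_tunnels + peers * recovery_tunnels_per_peer


def reserve_percent(tunnels_per_peer: int,
                    recovery_tunnels_per_peer: int) -> float:
    """Extra capacity needed, as a percentage of normal capacity."""
    return 100.0 * recovery_tunnels_per_peer / tunnels_per_peer


# One recovery tunnel per old tunnel (the current draft): 100% reserve.
print(peak_tunnels(8000, 1, 1))        # 16000
# One recovery tunnel per peer, 3 tunnels per peer: about 33% reserve.
print(peak_tunnels(8000, 3, 1))        # 10667
print(round(reserve_percent(3, 1)))    # 33
```

Doing recovery inside the existing tunnel corresponds to zero recovery tunnels per peer, which is why that option is listed as needing no additional reserve.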
Parallel recovery was definitely one of the design goals (Appendix A.2); so in line with your thinking, if we wish to recover we'd need to reserve the resources. Having a three-way dialogue (to reset one another's control plane) is a MUST. Now, doing that on the existing tunnel is what we are evaluating. My proposal is that using the existing mechanism in the draft, if we reset the old tunnel's control plane, thereby keeping the data plane hitless (because old tunnel-id is intact), is something workable and fits in existing constructs of tunnel establishment, including individually authenticating peers upon restart.

How does the transition to the new sequence number occur? I presume we hand off the new sequence numbers to the reliable layer and it then purges its re-ordering rx queue, renumbers any outstanding transmits and immediately re-transmits them? Any stale receives get automatically discarded as out of window.

- Renumbering outstanding transmits might create more unpredictability for no benefit. Because the control plane on the failed node would have lost the context related to previous messages, it is better to flush everything off and start with new sequence numbers.

[pwh4] I disagree. Say the outstanding transmit is an ICCN. The failed node may or may not have remembered sending the ICRP. By re-sending the ICCN, we allow the failed node at least the option of continuing the setup.

The target of the draft was to recover only the sessions that were in established state. This means if an endpoint does not keep track of a session's intermediate state (i.e. ICRP sent, or awaiting ICCN) then it doesn't matter if the other sends an ICCN or not, it would be discarded upon control plane restart for the tunnel.

The state machines at the protocol layer should already be capable of handling an unexpected ICCN (or for that matter any unexpected packet). If we were to throw away these packets, then we're potentially requiring additional complication at the protocol layer.
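
[Editorial illustration, not part of the original message] The two options debated above for the transmit queue after a resync (renumber the outstanding frames, or discard them) can be sketched as a toy reliable layer. Everything here is an assumption made for illustration: the class and method names, the window size, and the simplified modulo-65536 window test are hypothetical and are not taken from the failover draft or from any implementation. The sketch deliberately offers no "retransmit without renumbering" path, since the thread concludes that must not be done.

```python
from collections import deque
from dataclasses import dataclass

SEQ_MOD = 1 << 16  # Ns and Nr are 16-bit counters in the L2TP control header


@dataclass
class Frame:
    msg_type: str  # e.g. "ICCN" or "CDN"
    ns: int        # sequence number currently assigned to this frame


class ReliableLayer:
    """Toy model of one control connection's reliable delivery state."""

    def __init__(self, next_ns: int = 0, next_nr: int = 0, window: int = 4):
        self.next_ns = next_ns    # next Ns we will assign
        self.next_nr = next_nr    # next Ns we expect from the peer
        self.window = window      # receive window size (assumed value)
        self.tx_queue = deque()   # frames not yet acknowledged
        self.rx_reorder = {}      # out-of-order receives, keyed by Ns

    def submit(self, msg_type: str) -> Frame:
        """Queue a protocol-layer message with the next sequence number."""
        frame = Frame(msg_type, self.next_ns)
        self.next_ns = (self.next_ns + 1) % SEQ_MOD
        self.tx_queue.append(frame)
        return frame

    def in_window(self, ns: int) -> bool:
        """True if a received Ns falls inside the current receive window."""
        return (ns - self.next_nr) % SEQ_MOD < self.window

    def resync(self, new_ns: int, new_nr: int, renumber: bool = True) -> list:
        """Adopt new sequence numbers once the resync handshake completes.

        Returns the frames that were discarded (empty when renumbering).
        """
        self.rx_reorder.clear()  # stale receives are dropped
        self.next_ns = new_ns % SEQ_MOD
        self.next_nr = new_nr % SEQ_MOD
        if not renumber:
            dropped = list(self.tx_queue)
            self.tx_queue.clear()
            return dropped
        for frame in self.tx_queue:  # keep order, assign fresh numbers
            frame.ns = self.next_ns
            self.next_ns = (self.next_ns + 1) % SEQ_MOD
        return []
```

A short usage example, seen from the non-failed node and borrowing the values 4435 and 5665 from the handshake example in this thread: after `rl = ReliableLayer(next_ns=100, next_nr=200)`, `rl.submit("ICCN")`, `rl.submit("CDN")` and `rl.resync(new_ns=4435, new_nr=5665)`, the two queued frames carry Ns 4435 and 4436, and a late frame with the old Ns 201 fails `rl.in_window(201)` and would be dropped as out of window. With `renumber=False` the same call instead returns the two frames to the caller, which is the case where the protocol layer would have to cope with a transmit failure without a tunnel failure.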
Agree; the State Machine should be able to handle any packet in the life of a session or a tunnel. Regarding additional complication:
- I think section 2.3.4 addresses the inconsistency among session states.
- Renumbering the existing messages in the transmit queue is not going to eliminate the conditions that could result in the situations described in 2.3.4, so that needs to be there anyway. Then why bother resending these messages?
- If we are recovering only the sessions that were in established state then there is no need to retransmit messages for situations that could be handled otherwise by defined mechanisms.

I don't know how your protocol layer is implemented, but mine treats the reliable layer as a pipe. What I put into it is guaranteed to be delivered or I get a tunnel failed indication. It doesn't seem that I'd really want a tunnel failed indication here, but that would be my only choice if the reliable layer threw away a packet that had been submitted for transmit.

From what I have seen, the reliable layer typically is like a pipe which will either reliably deliver a packet or provide a tunnel failure indication. Therefore it doesn't make a difference if we renumber them or not from a delivery perspective. Once queued they'll be delivered. But my point was - why even re-mark their sequence numbers?

The old tunnel becomes active only upon getting confirmation from its peer that it has reset control plane sequence numbers. So this would have to be a three way handshake as described below. For example, for an old tunnel:
- Failed node sends: "Reset Nr to 5665 for Old Local Tid = 9, Old Tid = 6".
- Non Failed node responds: "Nr for Old Tid = 9 reset to 5665, Reset Nr to 4435 for the same tunnel (i.e. Old Local Tid = 6, Old Tid = 9 from this node's perspective)". Failed endpoint upon getting this message must first enqueue the response to this message and then start sending control messages on the Old tunnel.
- Failed node sends: "Nr for Old Tid = 9 reset to 4435".
Non failed node upon getting this can start sending control messages.

[pwh4] I presume in the case where control messages with the new sequence number space start arriving at the non-failed node before the final resync message, the messages get discarded as OOW. The failed node will then re-transmit and they'll be accepted once the non-failed node gets the final resync message.

What we discuss above could work. However, to make it simpler I think simply resetting the control plane of old tunnel would be good enough.

thanks, -- vipin

_______________________________________________ L2tpext mailing list L2tpext@ietf.org https://www1.ietf.org/mailman/listinfo/l2tpext

From l2tpext-admin@ietf.org Tue Nov 18 17:14:25 2003 Received: from optimus.ietf.org ([132.151.1.19]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA21033 for ; Tue, 18 Nov 2003 17:14:25 -0500 (EST) Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AME6j-0004bv-I4; Tue, 18 Nov 2003 17:14:01 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AEUXK-00009j-D0 for l2tpext@optimus.ietf.org; Tue, 28 Oct 2003 09:09:30 -0500 Received: from CNRI.Reston.VA.US (localhost [127.0.0.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id JAA20913; Tue, 28 Oct 2003 09:09:18 -0500 (EST) Message-Id: <200310281409.JAA20913@ietf.org> Mime-Version: 1.0 Content-Type: Multipart/Mixed; Boundary="NextPart" To: IETF-Announce: ; Cc: l2tpext@ietf.org From: Internet-Drafts@ietf.org Reply-to: Internet-Drafts@ietf.org Date: Tue, 28 Oct 2003 09:09:18 -0500 Subject: [L2tpext] I-D ACTION:draft-ietf-l2tpext-l2tp-base-11.txt Sender: l2tpext-admin@ietf.org 
Errors-To: l2tpext-admin@ietf.org X-BeenThere: l2tpext@ietf.org X-Mailman-Version: 2.0.12 Precedence: bulk List-Unsubscribe: , List-Id: Layer Two Tunneling Protocol Extensions List-Post: List-Help: List-Subscribe: , --NextPart A New Internet-Draft is available from the on-line Internet-Drafts directories. This draft is a work item of the Layer Two Tunneling Protocol Extensions Working Group of the IETF. Title : Layer Two Tunneling Protocol (Version 3) Author(s) : J. Lau, M. Townsley, I. Goyret Filename : draft-ietf-l2tpext-l2tp-base-11.txt Pages : 88 Date : 2003-10-27 This document describes 'version 3' of the Layer Two Tunneling Protocol (L2TPv3). L2TPv3 defines the base control protocol and encapsulation for tunneling multiple layer 2 connections between two IP connected nodes. Additional documents detail the specifics for each link-type being emulated. A URL for this Internet-Draft is: http://www.ietf.org/internet-drafts/draft-ietf-l2tpext-l2tp-base-11.txt To remove yourself from the IETF Announcement list, send a message to ietf-announce-request with the word unsubscribe in the body of the message. Internet-Drafts are also available by anonymous FTP. Login with the username "anonymous" and a password of your e-mail address. After logging in, type "cd internet-drafts" and then "get draft-ietf-l2tpext-l2tp-base-11.txt". A list of Internet-Drafts directories can be found in http://www.ietf.org/shadow.html or ftp://ftp.ietf.org/ietf/1shadow-sites.txt Internet-Drafts can also be obtained by e-mail. Send a message to: mailserv@ietf.org. In the body type: "FILE /internet-drafts/draft-ietf-l2tpext-l2tp-base-11.txt". NOTE: The mail server at ietf.org can return the document in MIME-encoded form by using the "mpack" utility. To use this feature, insert the command "ENCODING mime" before the "FILE" command. To decode the response(s), you will need "munpack" or a MIME-compliant mail reader. 
Different MIME-compliant mail readers exhibit different behavior, especially when dealing with "multipart" MIME messages (i.e. documents which have been split up into multiple messages), so check your local documentation on how to manipulate these messages. Below is the data which will enable a MIME compliant mail reader implementation to automatically retrieve the ASCII version of the Internet-Draft. --NextPart Content-Type: Multipart/Alternative; Boundary="OtherAccess" --OtherAccess Content-Type: Message/External-body; access-type="mail-server"; server="mailserv@ietf.org" Content-Type: text/plain Content-ID: <2003-10-28092644.I-D@ietf.org> ENCODING mime FILE /internet-drafts/draft-ietf-l2tpext-l2tp-base-11.txt --OtherAccess Content-Type: Message/External-body; name="draft-ietf-l2tpext-l2tp-base-11.txt"; site="ftp.ietf.org"; access-type="anon-ftp"; directory="internet-drafts" Content-Type: text/plain Content-ID: <2003-10-28092644.I-D@ietf.org> --OtherAccess-- --NextPart-- _______________________________________________ L2tpext mailing list L2tpext@ietf.org https://www1.ietf.org/mailman/listinfo/l2tpext From exim@www1.ietf.org Tue Nov 18 17:14:30 2003 Received: from optimus.ietf.org ([132.151.1.19]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA21067 for ; Tue, 18 Nov 2003 17:14:30 -0500 (EST) Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AME6u-0004dl-KN for l2tpext-archive@odin.ietf.org; Tue, 18 Nov 2003 17:14:12 -0500 Received: (from exim@localhost) by www1.ietf.org (8.12.8/8.12.8/Submit) id hAIMECGA017831 for l2tpext-archive@odin.ietf.org; Tue, 18 Nov 2003 17:14:12 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AME6u-0004dW-FL for l2tpext-web-archive@optimus.ietf.org; Tue, 18 Nov 2003 17:14:12 -0500 Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA21021 for ; Tue, 18 Nov 2003 
17:13:59 -0500 (EST) Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1AME6s-0000Nq-00 for l2tpext-web-archive@ietf.org; Tue, 18 Nov 2003 17:14:10 -0500 Received: from manatick.foretec.com ([4.17.168.5] helo=manatick) by ietf-mx with esmtp (Exim 4.12) id 1AME6r-0000Nl-00 for l2tpext-web-archive@ietf.org; Tue, 18 Nov 2003 17:14:09 -0500 Received: from [132.151.6.22] (helo=optimus.ietf.org) by manatick with esmtp (Exim 4.24) id 1AME6s-00088W-3D for l2tpext-web-archive@ietf.org; Tue, 18 Nov 2003 17:14:10 -0500 Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AME6j-0004c3-Ti; Tue, 18 Nov 2003 17:14:01 -0500 Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1AH5Zb-0007cJ-KN for l2tpext@optimus.ietf.org; Tue, 04 Nov 2003 13:06:35 -0500 Received: from asgard.ietf.org (asgard.ietf.org [10.27.6.40]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id NAA25583 for ; Tue, 4 Nov 2003 13:06:21 -0500 (EST) Received: from apache by asgard.ietf.org with local (Exim 4.14) id 1AH5Ob-0007sO-SN; Tue, 04 Nov 2003 12:55:13 -0500 X-test-idtracker: no To: IETF-Announce :; Cc: l2tpext@ietf.org From: The IESG Reply-to: iesg@ietf.org Message-Id: Date: Tue, 04 Nov 2003 12:55:13 -0500 Subject: [L2tpext] Last Call: 'Layer Two Tunneling Protocol (Version 3)' to Proposed Standard Sender: l2tpext-admin@ietf.org Errors-To: l2tpext-admin@ietf.org X-BeenThere: l2tpext@ietf.org X-Mailman-Version: 2.0.12 Precedence: bulk List-Unsubscribe: , List-Id: Layer Two Tunneling Protocol Extensions List-Post: List-Help: List-Subscribe: , The IESG has received a request from the Layer Two Tunneling Protocol Extensions WG to consider the following document: - 'Layer Two Tunneling Protocol (Version 3) ' as a Proposed Standard The IESG plans to make a decision in the next few weeks, and solicits final comments on this action. 
Please send any comments to the iesg@ietf.org or ietf@ietf.org mailing lists by 2003-11-25. The file can be obtained via http://www.ietf.org/internet-drafts/draft-ietf-l2tpext-l2tp-base-11.txt _______________________________________________ L2tpext mailing list L2tpext@ietf.org https://www1.ietf.org/mailman/listinfo/l2tpext