Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id UAA10980 for ; Tue, 27 Jun 2000 20:44:49 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 098725E206; Tue, 27 Jun 2000 19:46:17 -0400 (EDT)
Delivered-To: idr-outgoing@merit.edu
Received: by segue.merit.edu (Postfix, from userid 56) id 12AA05E3A4; Tue, 27 Jun 2000 19:08:53 -0400 (EDT)
Received: from c000.snv.cp.net (c000-h000.c000.snv.cp.net [209.228.32.64]) by segue.merit.edu (Postfix) with SMTP id B76D05DE9E for ; Tue, 27 Jun 2000 18:28:48 -0400 (EDT)
Received: (cpmta 22550 invoked from network); 27 Jun 2000 15:28:38 -0700
Received: from dhcp182.altasoft.com (HELO lap) (204.242.142.182) by smtp.ipoptical.com (209.228.32.64) with SMTP; 27 Jun 2000 15:28:38 -0700
X-Sent: 27 Jun 2000 22:28:38 GMT
Message-ID: <005101bfe0a0$92f85280$b68ef2cc@baces.com>
From: "ben abarbanel"
To:
Subject: Please review my latest version of BGP-4 TE draft
Date: Tue, 27 Jun 2000 18:31:23 -0700
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_NextPart_000_004D_01BFE065.E5516A20"
Sender: owner-idr@merit.edu
Precedence: bulk

This is a multi-part message in MIME format.

------=_NextPart_000_004D_01BFE065.E5516A20
Content-Type: multipart/alternative; boundary="----=_NextPart_001_004E_01BFE065.E5516A20"

------=_NextPart_001_004E_01BFE065.E5516A20
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

The draft is attached to this message. Please review it and give me your
comments. It has also been submitted to IETF.

Thank You,
Ben

------=_NextPart_001_004E_01BFE065.E5516A20
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
------=_NextPart_001_004E_01BFE065.E5516A20--

------=_NextPart_000_004D_01BFE065.E5516A20
Content-Type: text/plain; name="bgp_te_draft_06_27.txt"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="bgp_te_draft_06_27.txt"

Network Working Group                                     Ben Abarbanel
Internet Draft                                          IPOptical, Inc.
Document: draft-ietf-abarbanel-bgp4-te-01.txt     Senthil Venkatachalam
Expiration Date: September, 2000                         Alcatel, U.S.A

                 BGP-4 support for Traffic Engineering

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026 [].

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026 except that the right to
   produce derivative works is not granted.

   This document is an Internet-Draft and is NOT offered in accordance
   with Section 10 of RFC2026, and the author does not provide the IETF
   with any rights other than to publish as an Internet-Draft.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   Currently, constraint routing (CR) and traffic engineering (TE)
   models do not take into consideration the big picture view of IP
   traffic traversing multiple autonomous systems (AS). Most traffic
   engineering and constraint routing is based on IGP protocols such as
   OSPF and ISIS.
   The resulting view of the Internet is limited to one autonomous
   system and the areas or systems within it. Hence, the
   routing/forwarding functions do not select the optimum path for
   packets that need to traverse several autonomous systems.

   The proposal in this draft is that the BGP protocol can be utilized
   to choose the best BGP routes based on traffic engineered (TE)
   constraint weights. This information can be propagated between all
   BGP peers and calculated by the BGP AS border routers before it is
   deployed to their forwarding tables.

draft-ietf-abarbanel-bgp4-te.txt                               [page 2]

1. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [].

2. Overview

   The Internet is composed of many autonomous systems (ASs). BGP acts
   as the road map for selecting the best paths across these systems.
   Today's Internet infrastructure does not treat the Internet as one
   homogeneous system but as a collection of systems, each managed
   autonomously by a provider (ISP). These providers tend to filter or
   block routes from each other, which in itself creates non-optimum
   routing. Constraint based routing adds a level of traffic influence
   by defining the path (all the hops along the way) that a packet
   should take. It does this very well within an autonomous system and
   within the limits of OSPF/ISIS areas or levels. The area border
   routers inside the AS need to compute the best path to other areas
   until the packet arrives at its destination within the autonomous
   system or leaves the AS for the next autonomous system. The choice
   of which AS to select is based purely on a one dimensional level:
   the minimum number of AS hops to a given destination, or various
   weights and preferences that are local to the AS.
   The BGP Multi Exit Discriminator (MED) provides some level of
   inter-AS metric but is still limited to the adjacent AS.

   The approach discussed here is to view the Internet as one
   homogeneous entity made up of many ASs. Each AS can have a summary
   weight assigned to it based on a given traffic engineering
   criterion. This weight can be a type of quality of service, such as
   maximum IGP bandwidth available, maximum number of IGP hops, maximum
   IGP delay across the AS from one IBGP router to the other IBGP
   router, maximum bandwidth for a service class A, B, C, etc. These
   criteria levels are summarized by the IGP for a given network
   destination in a given AS and exported to BGP. The summarization
   within an AS can be achieved using the IGP constraint based routing
   algorithms used within link state protocols such as OSPF and ISIS.
   In the case of OSPF, a new opaque LSA is defined to propagate the
   summarized weights between OSPF areas.

   This draft is organized as follows. Section 3 details the approach
   at the BGP level between ASs and Section 4 discusses the approach at
   the IGP level within an AS. The various traffic engineering weights
   are discussed in Section 5. The following sections deal with the
   redistribution of the traffic engineering weight information from
   IGP to BGP, configuration issues, route flapping, and ISP issues.
   The BGP and IGP approaches together will provide a comprehensive
   traffic engineering routing system across the Internet.

3. BGP Level

   The method used to propagate constraint summarization weights for
   each AS is to define a new attribute which contains QOS-like sub
   fields that can include such parameters as bandwidth, number of
   hops, delay, various QOS service classes, and so on. Historical NLRI
   information is contained as well. These sub fields will be termed TE
   weights, and each BGP router would propagate this information to its
   peers.
   This makes it possible for any router that handles the data to have
   a traffic engineering view of all the paths and their associated TE
   Weights for a given route to a given router.

   When the BGP RIB database is loaded with TE Weight information, a TE
   capable BGP router would compute the best BGP route for a given
   destination based on manually configured TE criteria. The BGP Route
   Selection process is extended to support a TE way of prioritizing
   the best routes for any configured destination. The order and
   preference of the routes can be changed to give the TE weight
   attribute a higher priority than other attributes. Different TE
   Weight types could be manually assigned different levels of priority
   so that the BGP route selection criteria can be further optimized.

3.1 BGP Routes Originating in Local AS

   Routes that originated within the same AS will not be calculated for
   TE weights since there is no BGP optimization that can be achieved.
   In this case, all TE Weight calculations are strictly done at the
   IGP level. As a rule, IBGP routers will not propagate to other IBGP
   routers the new TE Weight attribute for routes that originated in
   the same AS. Routes that came from other EBGP peers could contain
   the new TE Weight attribute, which will be propagated to IBGP peers.

3.2 Route Reflector Functionality

   When a Route Reflector receives an update message with the
   Aggregated TE Weight attribute, it will simply add it to the BGP
   RIB-IN/OUT for local use and propagate it to other IBGP peers
   without modifying any of its fields.

3.3 When to Add TE Weight to BGP Update messages

   When an IBGP router receives an update message from another IBGP
   router for a (route) destination outside its own AS, it will consult
   the local FIB database for its reachability via the BGP Next Hop of
   the corresponding update message.
   Using this BGP Next Hop it will check the associated TE summary
   weight provided in the FIB by the extended IGP constraint routing
   calculations. It will add or compare the IGP TE summary weight with
   the Aggregated TE Weight that was propagated in the update message
   and create the next level Aggregated TE weight. Next, it will add
   the Aggregated TE weight to the local BGP RIB and use it for the BGP
   route selection process. In addition, the current router will
   propagate the new Aggregated TE weight value to all of its EBGP
   peers except the one that introduced the route.

   Propagating and aggregating BGP TE Weights, and deploying them into
   the BGP RIB and FIB tables, is performed from one router to the next
   so that the TE weight seen by any router along an AS path is
   accurate enough to cover the TE measurements from that router to the
   route's point of origin.

3.4 Route Aggregation Impacts

   As routes are propagated from one AS to the next, it is possible
   that providers have configured their routers to aggregate more
   specific routes into general ones. This condition will cause the
   specific routes and their TE Weight attributes to get lost or
   reduced in accuracy. In order to keep track of this information, it
   is proposed that the specific routes and their TE Weights be
   contained in the new TE Weight attribute fields. It would then be
   possible at any router along the route propagation paths to
   determine the detailed TE weights that made up the super aggregate
   TE Weight.

   Example 1: A single AS-Path with no route aggregation, where TE
   weights are aggregated.

      In AS1             In AS2                   Arriving in Dest AS3
      N1N2,AS1,TE1 ----> N1N2,AS2,(TE1+TE2) ----> N1N2,TE12

   In AS3 all we need is the routes (N1N2) and the TE weight TE12,
   which is the aggregate of TE1 and TE2.
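The hop-by-hop aggregation in Example 1 can be sketched in a few lines of Python. This is an illustrative sketch only and not part of the draft: the function name `aggregate_te_weight`, the weight-type strings, and the numeric values are assumptions. It borrows the combination rules that Section 5 describes, where available bandwidth aggregates as the smallest value along the path and delay aggregates as a sum.

```python
# Illustrative sketch only; names and values are hypothetical, not from the draft.
from functools import reduce

# How two TE weights of the same type combine when a route crosses an AS.
# Section 5 describes bandwidth as "smallest value wins" and delay as additive.
COMBINE = {
    "bandwidth": min,                  # bottleneck along the path
    "delay": lambda a, b: a + b,       # delays accumulate
}

def aggregate_te_weight(weight_type, per_as_weights):
    """Fold the per-AS summary weights into one aggregated TE weight."""
    if not per_as_weights:
        raise ValueError("at least one per-AS weight is required")
    return reduce(COMBINE[weight_type], per_as_weights)

# TE1 (summary weight in AS1) and TE2 (summary weight in AS2) combine
# into TE12, the value that arrives in AS3.
te12_bandwidth = aggregate_te_weight("bandwidth", [70, 30])   # Mbit/s
te12_delay = aggregate_te_weight("delay", [12, 8])            # milliseconds
print(te12_bandwidth, te12_delay)
```

Keeping the combination rule in a per-type table mirrors the draft's statement that each TE weight type has its own aggregation algorithm; other types would need their own entries.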
   Example 2: Aggregating routes from multiple AS-Paths, where TE
   weights are super aggregated too.

      In AS1             In AS2                 In Dest AS3
      N1N2,AS1,TE1 ----> N1N2,AS2,(TE1+TE2) --> N1N2,TE12 --> see below

      In AS4             In AS5                 In Dest AS3
      N3N4,AS5,TE5 ----> N3N4,AS5,(TE5+TE6) --> N3N4,TE56 --> see below

   At this point in AS3, routes N1N2 and N3N4 are aggregated together
   to form the super aggregate route N13N24, with the aggregated TE
   weight (TE12+TE56) = super aggregated TE Weight TE1256.

   In order to keep track of which routes made up TE12 and TE56, we
   keep two TE Weight paths in the super aggregated route N13N24 as
   follows:

      N13N24/TE1256 = (N1N2/TE12, N3N4/TE56) in the TE Weight Attribute
      field.

   This list of historical routes can be as large as the number of
   route aggregation points along the AS-Paths.

   *Note: For each list of historical routes and TE weights, there will
          be one or several TE Weight types. See section 3.5 for
          details.

3.5 TE Weight Calculation Impacts

   As was seen in section 3.4, routes and TE Weights are aggregated
   together into less specific information, and as a result the TE
   weights become more and more global. This effect creates sub-optimal
   data for choosing the best path across a series of ASs. If there is
   a historical trace of the routes and TE weights that created the
   global aggregated route, then it is possible to make a more accurate
   decision in choosing the best BGP route. This is done during the
   route selection process, which is run for the same route received
   from multiple BGP peers. With the information in the TE Weight
   Attribute it is possible to look inside an aggregated route, see its
   composition more precisely (sub-aggregation of routes and more
   specific TE weights), and determine the best TE route and related
   BGP Next Hop.
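The idea of looking inside an aggregated route during route selection can be sketched as follows. This is a hedged illustration rather than the draft's algorithm: the `Component` and `Advertisement` classes, the peer names, the prefixes, and the bandwidth figures are all hypothetical, and the rule "prefer the longest matching historical prefix, then the largest bandwidth" is an assumption made for the example.

```python
# Illustrative sketch only; classes, prefixes, and weights are hypothetical.
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network

@dataclass
class Component:
    prefix: str       # a more specific route folded into the aggregate
    bandwidth: int    # its aggregated TE weight (Mbit/s)

@dataclass
class Advertisement:
    next_hop: str     # BGP Next Hop of the peer that sent the update
    prefix: str       # the (super) aggregated route
    bandwidth: int    # the super aggregated TE weight
    history: list = field(default_factory=list)   # historical components

def weight_for(adv, destination):
    """Return the most precise bandwidth known for destination, or None."""
    addr = ip_address(destination)
    if addr not in ip_network(adv.prefix):
        return None
    matches = [c for c in adv.history if addr in ip_network(c.prefix)]
    if matches:
        # Prefer the longest matching historical prefix.
        best = max(matches, key=lambda c: ip_network(c.prefix).prefixlen)
        return best.bandwidth
    return adv.bandwidth   # no history: fall back to the aggregate value

def best_next_hop(advs, destination):
    """Pick the next hop whose advertisement offers the most bandwidth."""
    scored = [(weight_for(a, destination), a.next_hop) for a in advs]
    scored = [pair for pair in scored if pair[0] is not None]
    return max(scored, key=lambda pair: pair[0])[1] if scored else None

advs = [
    Advertisement("peerA", "10.0.0.0/8", 30,
                  [Component("10.1.0.0/16", 30), Component("10.2.0.0/16", 60)]),
    Advertisement("peerB", "10.0.0.0/8", 45),
]
print(best_next_hop(advs, "10.2.3.4"), best_next_hop(advs, "10.1.3.4"))
```

In this made-up data the aggregate values alone (30 versus 45) would always favour peerB, while the historical components show that peerA is the better choice for destinations under 10.2.0.0/16.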
3.6 TE Weight Attribute Format

   The BGP update message will contain a new Optional Transitive
   attribute called TE Weight (type code ?). This contains one or
   several aggregated TE Weight types and their historical sub
   aggregated routes and TE weights. The amount of aggregation would
   depend on the number of AS's the route had traversed from the point
   of origin to the current router.

   A TE Weight type must have the same consistency or granularity
   throughout all the routers that have the software to aggregate,
   calculate, and propagate it further.

   |-- Optional(1) or well known(0)
   | |-- transitive(1) or non-transitive(0)
   | | |-Attr Flags
   | | |               1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-|-|-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++
   |1|1|  FLAGS   | Attr Type (?) |    Number TE Weight Lists     |
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++
   +==============|===============|===============|===============+
   |          1st Super Aggregated TE Weight list entry           |
   +==============|===============|===============|===============+
   | Number of TE |   TE Weight   | 1st Super Aggregated TE Weight|
   | Weight types |    Type 1     |             Value             |
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++
   | 2nd TE Weight| 2nd Super Aggregated TE Weight|nth TE Weight  |
   |     Type     |             Value             |type           |
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++
   |nth Super Aggregated TE Weight| Number of Aggregated Route    |
   |            Value             |           Prefixes            |
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++
   |route prefix 1| 1st Aggregated IP Route prefix                |
   |length        | 1st, 2nd, 3rd byte                            |
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++
   |1st route pref| route prefix n| nth Aggregated IP route prefix|
   | 4th byte     | length        | 1st, and 2nd byte             |
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++
   |nth Aggregated IP route prefix|                               |
   | 3rd and 4th byte             |              ...              |
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++

   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++
   +==============|===============|===============|===============+
   |          nth Super Aggregated TE Weight list entry           |
   +==============|===============|===============|===============+
   | Number of TE |   TE Weight   | 1st Super Aggregated TE Weight|
   | Weight types |    Type 1     |             Value             |
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++
   | 2nd TE Weight| 2nd Super Aggregated TE Weight|nth TE Weight  |
   |     Type     |             Value             |type           |
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++
   |nth Super Aggregated TE Weight| Number of Aggregated Route    |
   |            Value             |           Prefixes            |
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++
   |route prefix 1| 1st Aggregated IP Route prefix                |
   |length        | 1st, 2nd, 3rd byte                            |
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++
   |1st route pref| route prefix n| nth Aggregated IP route prefix|
   | 4th byte     | length        | 1st, and 2nd byte             |
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++
   |nth Aggregated IP route prefix|                               |
   | 3rd and 4th byte             |              ...              |
   +-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-+|+-+-+-+-+-+-+-++

   - Each TE Weight type could be:
        o Maximum Bandwidth Available
        o Maximum Number of IGP Hops
        o Maximum Transit Delay
        o Color
        o Etc.
   - Each TE Weight value can be in units relevant to its use:
        o Maximum bandwidth could be in Megabits/second.
        o Maximum transit delay could be in milliseconds.

   - Aggregated IP Route Prefix: Defines the aggregation of several
     routes into a single route for a given AS.

   - Route Prefix Length: The number of significant bits counted from
     the left side, or high order byte, of the IP address, as defined
     in CIDR.
     Example: for IP address 10.20.30.40, the high order byte is the
     10.

   - Super Aggregated TE Weight: Each TE weight will have a different
     algorithm for aggregation and a different meaning. See section 5
     for details.

3.7 Phasing TE Weights

   The use of TE weights can be manually configured on a destination
   route prefix basis, and thus it is possible to enable or disable TE
   Weights on any BGP route. The functionality can be easily phased
   into an AS where some BGP routers are TE Weight capable and others
   are not. The new attribute will be Optional and Transitive, which
   implies that older BGP routers will be required to propagate the new
   attribute in update messages to their peers.

   When not all of the BGP routers along a given TE Path support the TE
   Weight attribute, the TE weight calculation is not performed at the
   routers that lack support, and the best BGP route selection
   computation will be less than optimal.

4. IGP TE Constraints based Calculations and AS Summarization for TE

   The IGP level routing protocol used is generally either OSPF or
   ISIS. The basic OSPF and ISIS protocols allow routing based on a
   static metric. These protocols can be extended to take into
   consideration various traffic engineering criteria such as
   unreserved bandwidth, delay, colors, OSPF metric, etc. The approach
   detailed below is oriented more toward OSPF, but can be readily
   extended to ISIS.

   If the IGP inside a routing domain is OSPF, the AS is divided into
   areas, each of which is a collection of routers and networks.
   The routers within an area know the complete topology of the area by
   means of LSA database synchronization with their neighbors at all
   times. Through the use of the opaque LSAs [8], it is possible to
   maintain the traffic engineering topology (in addition to the
   regular network topology) and hence perform route calculations based
   on traffic engineering criteria within an area inside an AS that
   runs OSPF. However, this restricts the traffic engineering
   calculations to within an area.

   To determine the traffic engineering weight to any network or router
   in the AS, the approach proposed for OSPF is to define a new opaque
   LSA for traffic engineering that summarizes the traffic engineering
   weights of every network from the perspective of an area border
   router (ABR), for each area in the AS. This traffic engineering
   summary LSA is analogous to the summary LSA in OSPF. More details on
   this work can be found in [9].

   The destination network could either lie in the same area as the
   ASBR, or in a different area. If the destination network and the
   ASBR are in the same area, the opaque traffic engineering LSAs as
   defined in [8] for the area will suffice to calculate the traffic
   engineering metric to the destination network. If the destination
   network lies in a different area than the ASBR, the weight is a
   combination (such as simple addition, or max) of the weight to the
   ABR and the weight described in the traffic engineering summary LSA
   originated by the ABR that contains the destination network.

   In this manner, the traffic engineering weights for all the networks
   in an AS can be computed. Hence the ASBRs will also be able to
   determine the aggregate traffic engineering weights across the AS to
   other ASBRs and use this in the BGP advertisements.

5. Weights

   The traffic engineering weights act as a cost or distance function,
   describing the quality of a path to a destination network in traffic
   engineering terms.
   The traffic engineering weights currently proposed are:

      1. Available Bandwidth (in bits/sec)
      2. Unreserved Bandwidth (in each of several classes)
      3. Colors (or class types) (8 types)
      4. Transit Delay (in milliseconds)
      5. IGP Metric
      6. IGP Hops

   These traffic engineering weights can be dynamically measured or
   statically configured for each interface in a node. The weights can
   be propagated within an IGP area using the opaque traffic
   engineering LSA. The area border routers (ABRs) can summarize the
   traffic engineering weights to destination networks in their area
   and flood this information into the backbone through the use of the
   traffic engineering summary LSA as proposed in [9]. The summary LSAs
   can then be flooded by other ABRs into their areas.

   Any ASBR can determine the TE weight to a destination network by
   examining its TE database and having its associated IGP perform a
   simple Dijkstra-like calculation. This calculation can be a real
   Dijkstra if the weight is additive (delay, hops, or IGP metric), or
   Dijkstra-like if the weight is a minimum to maximum constraint
   (available bandwidth, unreserved bandwidth, and colors).

   Depending on the level of processing desired, the number of
   supported TE weights should be configured.

   These calculations result in a value for each TE Weight for each
   destination network in the AS and for each ASBR across the AS.

   The weights from the IGP are then exported into BGP for propagation
   to its EBGP peers.

   The following two subsections deal specifically with the treatment
   of the available bandwidth and delay TE weights during the
   calculations:

   Maximum Available Bandwidth:

      An IBGP/IGP router within a given AS will pick the smallest
      available bandwidth link from the ingress to the egress of the
      IBGP to IBGP path and use that value as the summary TE weight.
      Next, the egress IGP will export this information to BGP, which
      will use it and propagate it to its EBGP peers as the first
      aggregated TE Weight value. As the route is propagated to each
      IBGP peer in another AS, the Aggregated TE weight value in the
      message is compared with the summary TE weight for the BGP Next
      Hop router defined in the message. Whichever value is the smaller
      of the two will be considered the next level of aggregation value
      and propagated further to the next AS.

      This mechanism will continue until the route reaches the furthest
      points in the Internet. This way the smallest available bandwidth
      value will be used as the overall Aggregated TE Weight value for
      any given route along the TE path.

   Maximum Delay:

      An IBGP/IGP router within a given AS will obtain the delay from
      the ingress to the egress of the IBGP to IBGP path and use that
      value as the summary TE "delay" weight. Next, the egress IGP will
      export this information to BGP, which will use it and propagate
      it to its EBGP peers as the first aggregated TE Weight "delay"
      value. As the route is propagated to each IBGP peer in another
      AS, the Aggregated TE weight value in the message is added to the
      summary TE "delay" weight for the BGP Next Hop router defined in
      the message. The sum of the two is considered the next level of
      aggregation and propagated further to the next AS. This mechanism
      will continue until the route reaches the furthest points in the
      Internet.

   *Note: See section 3.4 for Route Aggregation Impacts, which must be
          followed in order for this mechanism to be optimal. As
          mentioned in section 3.4, historical routes/TE Weights are
          carried in this new attribute for routes that are aggregated
          several times.

6. TE Weights Redistributed from the IGP to BGP

   In order for a given router to inherit the TE weights from the IGP,
   a mechanism must be provided to do that. What is proposed is that
   the standard FDB database contain TE summary weight fields in each
   route entry, with their TE weight type, TE weight value, and
   associated AS number. In other words, in all the BGP routers, the
   running IGP will inject the TE summary weight information into the
   FDB database and signal BGP to import it into its BGP RIB database
   for propagation to other BGP peers.

7. Configuring the TE feature for BGP and IGP

   In order to minimize the amount of information produced by the TE
   weight parameters across the entire Internet as described in section
   3.6, it is recommended that the TE weight be configured at the
   network route's point of origin. This implies that the AS router
   that originated the route into BGP will add the TE Weight attribute,
   and all transit routers that propagate this route will perform the
   TE Weight processing when they see the TE Weight Attribute. This way
   it will be possible to limit the amount of TE Weight information
   being propagated across the Internet.

   What is needed to manually configure the BGP Best Route Selection
   criteria:

   I. To enable the TE Weight feature for a given network prefix:

      a. Define each network prefix for TE Weight functionality.

      Once enabled, all transit BGP TE routers will process the TE
      Weight aggregation and propagate the TE Weight attribute to their
      peers.

   II. To manually configure the BGP Best Route Selection Criteria in
       each BGP TE Router, configure the following parameters:

      a. TE Weight Type
           o Maximum Bandwidth Available (Megabits/sec)
           o Unreserved Bandwidth (in each of several classes)
           o Maximum Number of IGP Hops
           o IGP Metric
           o Maximum Transit Delay (in Milliseconds)
           o Color (or class types) (8 types)

      b.
TE Weight Priority 1st choice =3D Make this TE Weight the first choice =20 for best BGP route selection criteria 2nd choice =3D Make this TE Weight the 2nd choice for best BGP route selection criteria nth choice =3D Make this TE Weight the 2nd choice for best BGP route selection criteria c. Example of BGP TE Topology AS2 +---------+ 70 MB +--------------+ | R6+------------------------------------| R7--net1--R13| | AS5 ||70MB |/ \ 80MB| | | R5| / \ | | |---------+ /| \30MB | | \ 30MB / | \------R8| \70MB / +------------+-+ \ AS1 / | +-------+--------|----------/+ |30MB | R2 | 30MB | R3|\ +------+-------+ | | /-|--------|---------/ | \ | R9-net2 | | | / | | | \60MB | 60MB / | Source--+->R1/ | Area 0 | Area 2 | \ | / | | \----+--------+--------\ | \ | / | | Area 1| 30MB | R4 | \---+-R10 AS3 | +-------+--------|-----------+ +-- -\---------+ \ \ \30MB \30MB \ \=20 \+----------- --\----+ |R11----------- R12 | | 30MB / | | / | | AS4 net3 | +-------------------+ Figure 1. BGP TE Topology draft-ietf-abarbanel-bgp4-te.txt = [page 11] Table 1. 
Router R1 BGP TE Weight Table

   +---------+-------+---------------+-----------------------+
   | Entry # | Route | BGP Next Hop  | Aggregated TE Weight  |
   |         |       |               | Bandwidth             |
   +---------+-------+---------------+-----------------------+
   |    1    | NET1  | R7 via R3     | 30 MB                 |
   +---------+-------+---------------+-----------------------+
   |    2    |*NET1  | R5 via R2     | 70 MB                 |
   +---------+-------+---------------+-----------------------+
   |    3    | NET1  | R10 via R3    | 60 MB                 |
   +---------+-------+---------------+-----------------------+
   |    4    | NET1  | R11 via R4    | 30 MB                 |
   +---------+-------+---------------+-----------------------+
   |    5    | NET2  | R11 via R4    | 30 MB                 |
   +---------+-------+---------------+-----------------------+
   |    6    |*NET2  | R10 via R3    | 60 MB                 |
   +---------+-------+---------------+-----------------------+
   |    7    | NET2  | R7 via R3     | 30 MB                 |
   +---------+-------+---------------+-----------------------+
   |    8    | NET2  | R5 via R2     | 30 MB                 |
   +---------+-------+---------------+-----------------------+

   *Note: Although we show only one TE weight type under each AS
          subfield, it is possible to have multiple TE weight types and
          have them prioritized by manual configuration in each BGP TE
          router.

   In Figure 1 we see that R1 is the current router receiving Update
   messages from R2, R3, and R4. All the Update messages contain the
   new BGP TE Weight Attribute with the weights as shown in Table 1.
   From this data, we see that for NET1, Entry #2 is the best choice
   among the NET1 entries, because it guarantees that we get 70MB of
   bandwidth from AS2 and AS5. This assumes that the operator has
   configured the TE (bandwidth) weight as the highest priority
   criterion. For the NET2 entries, Entry #6 is the best choice: even
   though it includes more AS hops, it guarantees that at least 60MB of
   bandwidth will be available. This case shows how sacrificing more AS
   hops for guaranteed bandwidth service can sometimes work.

8. Route Flapping and Impacts on BGP Aggregated TE Weights

   In the real world, route flapping is a normal occurrence. It is
   expected that the router performing TE Weight Aggregation will be
   notified by the IGP that routes have gone down or come up, and it
   will be required to recalculate the aggregated TE Weight. When a new
   aggregated TE weight is defined for an aggregated route, the router
   performing this calculation will be required to deploy it to its BGP
   peers.

9. BGP Confederations

   Since the IGP domain will cover the entire AS, any mapping of BGP
   Confederation within each of the sub-ASs will not be used for BGP TE
   aggregation.

   Internal sub-AS BGP routers running either EBGP or IBGP will not
   aggregate the new TE Weight attribute, since that job will only be
   done by the BGP ASBR routers. They will, however, use these
   attributes in their route selection criteria and propagate them
   further to all inter/intra sub-AS peers.

10. ISP Issues and Dependencies

   TBD

11. Conclusion

   As has been shown, the Internet is a complex web of systems within
   systems. Traffic Engineering is the only solution that attempts to
   control/contain the explosion of resources within routers. The
   technique described in this manuscript defines a way for BGP routers
   to influence and control the routing paths chosen for routes that
   are bound by Traffic Engineering Weights. This gives the inter-AS
   provider a more uniform method of allocating resources and providing
   guaranteed service using BGP TE across multiple Autonomous Systems.

12. Security Considerations

   The new BGP TE Weight attribute does not change the underlying
   security issues inherent in the existing BGP-4 design.

13. References

   [1]  Rekhter, Y. and Li, T., "A Border Gateway Protocol 4 (BGP-4)",
        RFC 1771, March 1995.

   [2]  Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., McManus, J.,
        "Requirements for Traffic Engineering Over MPLS", RFC 2702,
        September 1999.

   [3]  Awduche, D., Rekhter, Y., Drake, J., Coltun, R.,
        "Multi-Protocol Lambda Switching: Combining MPLS Traffic
        Engineering Control With Optical Crossconnects",
        draft-awduche-mpls-te-optical-01.txt.

   [4]  Jamoussi, B., "Constraint-Based LSP Setup using LDP", Work in
        Progress, Internet Draft, September 1999.

   [5]  Braden, R., Zhang, L., Berson, S., Herzog, S., "Resource
        ReSerVation Protocol (RSVP) -- Version 1 Functional
        Specification", RFC 2205, September 1997.

   [6]  Awduche, D. et al., "Extensions to RSVP for LSP Tunnels", Work
        in Progress, Internet Draft, September 2000.

   [8]  Katz, D. and Yeung, D., "Traffic Engineering Extensions to
        OSPF", Internet Draft.

   [9]  Venkatachalam, S. and Abarbanel, B., "Traffic Engineering
        Summary Extensions to OSPF", Work in Progress.

   [10] Villamizar, "BGP Route Flap Damping", RFC 2439, November 1998.

14. Acknowledgments

   To be supplied in future revisions.

15. Author's Addresses

   Ben Abarbanel
   IPOptical, Inc.
11480 Sunset Hills Rd, Suite 200E Reston, VA 20190 email: ben.abarbanel@ipoptical.com Senthil Venkatachalam Alcatel USA 45195 Business Court, Suite 400 Dulles, VA 20166 email: senthil.venkatachalam@usa.alcatel.com ------=_NextPart_000_004D_01BFE065.E5516A20-- Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id NAA29105 for ; Fri, 16 Jun 2000 13:03:40 -0400 (EDT) Received: by segue.merit.edu (Postfix) id 4CBB95DDA5; Fri, 16 Jun 2000 13:03:15 -0400 (EDT) Delivered-To: idr-outgoing@merit.edu Received: by segue.merit.edu (Postfix, from userid 56) id 396145DDC0; Fri, 16 Jun 2000 13:03:15 -0400 (EDT) Received: from mailrelay00.aa.ops.us.uu.net (postal.aa.ops.us.uu.net [147.225.22.28]) by segue.merit.edu (Postfix) with ESMTP id DFFA25DDA5 for ; Fri, 16 Jun 2000 13:03:13 -0400 (EDT) Received: from flipper.cisco.com (flipper.cisco.com [171.69.63.10]) by mailrelay00.aa.ops.us.uu.net (8.9.3/8.9.3) with ESMTP id NAA18642 for ; Fri, 16 Jun 2000 13:03:13 -0400 (EDT) Received: (rsrihari@localhost) by flipper.cisco.com (8.8.5-Cisco.2-SunOS.5.5.1.sun4/8.6.5) id KAA00807; Fri, 16 Jun 2000 10:02:04 -0700 (PDT) From: Srihari Ramachandra Message-Id: <200006161702.KAA00807@flipper.cisco.com> Subject: Re: capabilities option To: BRijsman@unispheresolutions.com (Rijsman, Bruno) Date: Fri, 16 Jun 2000 10:02:03 -0700 (PDT) Cc: patrick.mensch@alcatel.be ('patrick.mensch@alcatel.be'), idrp@merit.edu ('IDRP exploder'), bgp@ans.net ('BGP exploder') In-Reply-To: <49FF5C6DDBD8D311BBBD009027DE980C073843@uniwest1.redstonecom.com> from "Rijsman, Bruno" at Jun 16, 2000 10:03:25 AM X-Mailer: ELM [version 2.5 PL1] MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-idr@merit.edu Precedence: bulk > > > o capability code = 0x80 = RFC2434 - "Private Use" > > (Cisco proprietary ?) 
> > I believe Cisco uses 0x80 to advertise the route refresh capability > (draft-ietf-idr-bgp-route-refresh-01 proposes to use value 2 for this, but I > guess Cisco picked a private value while waiting for a standard value to be > allocated). True, some older implementations use only 0x80 as the capability code. The later images (and 12.0ST) have the new value also. srihari... Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id KAA26265 for ; Fri, 16 Jun 2000 10:02:16 -0400 (EDT) Received: by segue.merit.edu (Postfix) id 4598C5DE85; Fri, 16 Jun 2000 10:00:20 -0400 (EDT) Delivered-To: idr-outgoing@merit.edu Received: by segue.merit.edu (Postfix, from userid 56) id 33B975DE84; Fri, 16 Jun 2000 10:00:20 -0400 (EDT) Received: from mailrelay00.aa.ops.us.uu.net (postal.aa.ops.us.uu.net [147.225.22.28]) by segue.merit.edu (Postfix) with ESMTP id 54D6F5DE81 for ; Fri, 16 Jun 2000 10:00:11 -0400 (EDT) Received: from uniwest1.redstonecom.com ([199.105.223.130]) by mailrelay00.aa.ops.us.uu.net (8.9.3/8.9.3) with ESMTP id KAA08827 for ; Fri, 16 Jun 2000 10:00:10 -0400 (EDT) Received: by uniwest1.redstonecom.com with Internet Mail Service (5.5.2650.21) id ; Fri, 16 Jun 2000 10:03:28 -0400 Message-ID: <49FF5C6DDBD8D311BBBD009027DE980C073843@uniwest1.redstonecom.com> From: "Rijsman, Bruno" To: "'patrick.mensch@alcatel.be'" Cc: "'IDRP exploder'" , "'BGP exploder'" Subject: RE: capabilities option Date: Fri, 16 Jun 2000 10:03:25 -0400 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2650.21) Content-Type: text/plain; charset="iso-8859-1" Sender: owner-idr@merit.edu Precedence: bulk > o capability code = 0x80 = RFC2434 - "Private Use" > (Cisco proprietary ?) I believe Cisco uses 0x80 to advertise the route refresh capability (draft-ietf-idr-bgp-route-refresh-01 proposes to use value 2 for this, but I guess Cisco picked a private value while waiting for a standard value to be allocated). -- Bruno. 
-----Original Message-----
From: Patrick Mensch [mailto:menschp@sebb.bel.alcatel.be]
Sent: Friday, June 16, 2000 4:53 AM
To: Natale, John; bgp@ans.net
Subject: Re: capabilities option

"Natale, John" wrote:
>
> > What does this open message mean (from a Cisco)?
> >
> > [rest of open msg pkt...(BGP ID)]0C 02 06 01 04 00 01 00 01 02 02 80
> > 00
> >
> > I got as far as:
> >
> > 0C = option parameter length = 12
> > 02 = option parameter type = capability
> >
> > 06 01 04 00 01 00 01 02 02 80 00 = option parameter value

I think you should interpret it as:

 o optional parameter type = 0x02 = BGP capabilities
 o optional parameter length = 0x06
 o capability code = 0x01 = Multiprotocol Extension Capability
 o capability length = 0x04
 o capability value is
     * AFI (16-bit) = 0x0001 = IPv4
     * Reserved (8-bit) = 0x00
     * SAFI (8-bit) = 0x01 = Unicast Forwarding Capability

 o optional parameter type = 0x02 = BGP capabilities
 o optional parameter length = 0x02
 o capability code = 0x80 = RFC2434 - "Private Use"
   (Cisco proprietary ?)
 o capability length = 0x00

> > capability#1 code is 6
> > capability#1 length is 1
> > capability#1 value is 4
> >
> > capability#2 code is 0
> > capability#2 length is 1
> > capability#2 value is 0
> >
> > capability#3 code is 1
> > capability#3 length is 2
> > capability#3 value is 02 08
> >
> > capability#4 code is 0
> > ???
> > ???
> >
> > Per "draft-ietf-idr-bgp4-cap-neg-06.txt", I looked at RFC 2434 to
> > determine what these codes are. RFC 2434 refers to "the IETF consensus
> > process"
> >
> > thank you in advance

Patrick.
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id EAA22023 for ; Fri, 16 Jun 2000 04:56:54 -0400 (EDT) Received: by segue.merit.edu (Postfix) id 77DF25DE98; Fri, 16 Jun 2000 04:56:30 -0400 (EDT) Delivered-To: idr-outgoing@merit.edu Received: by segue.merit.edu (Postfix, from userid 56) id 568CF5DE99; Fri, 16 Jun 2000 04:56:30 -0400 (EDT) Received: from mailrelay00.aa.ops.us.uu.net (postal.aa.ops.us.uu.net [147.225.22.28]) by segue.merit.edu (Postfix) with ESMTP id 46C9A5DE98 for ; Fri, 16 Jun 2000 04:56:28 -0400 (EDT) Received: from relay1.alcatel.be (alc119.alcatel.be [195.207.101.119]) by mailrelay00.aa.ops.us.uu.net (8.9.3/8.9.3) with ESMTP id EAA28501 for ; Fri, 16 Jun 2000 04:56:27 -0400 (EDT) Received: from btmp80.sebb.bel.alcatel.be (localhost [127.0.0.1]) by relay1.alcatel.be (8.10.1/8.10.1) with ESMTP id e5G8tn929235; Fri, 16 Jun 2000 10:55:50 +0200 (MET DST) Received: from sebb.bel.alcatel.be (btk07q [138.203.187.143]) by btmp80.sebb.bel.alcatel.be (8.8.8+Sun/8.8.8) with ESMTP id KAA26771; Fri, 16 Jun 2000 10:55:44 +0200 (MET DST) Message-ID: <3949EB08.866E71@sebb.bel.alcatel.be> Date: Fri, 16 Jun 2000 10:53:28 +0200 From: Patrick Mensch Reply-To: patrick.mensch@alcatel.be X-Mailer: Mozilla 4.7 [en] (WinNT; I) X-Accept-Language: en MIME-Version: 1.0 To: "Natale, John" , "bgp@ans.net" Subject: Re: capabilities option References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-idr@merit.edu Precedence: bulk "Natale, John" wrote: > > > What does this open message mean (from a Cisco)? 
> >
> > [rest of open msg pkt...(BGP ID)]0C 02 06 01 04 00 01 00 01 02 02 80
> > 00
> >
> > I got as far as:
> >
> > 0C = option parameter length = 12
> > 02 = option parameter type = capability
> >
> > 06 01 04 00 01 00 01 02 02 80 00 = option parameter value

I think you should interpret it as:

 o optional parameter type = 0x02 = BGP capabilities
 o optional parameter length = 0x06
 o capability code = 0x01 = Multiprotocol Extension Capability
 o capability length = 0x04
 o capability value is
     * AFI (16-bit) = 0x0001 = IPv4
     * Reserved (8-bit) = 0x00
     * SAFI (8-bit) = 0x01 = Unicast Forwarding Capability

 o optional parameter type = 0x02 = BGP capabilities
 o optional parameter length = 0x02
 o capability code = 0x80 = RFC2434 - "Private Use"
   (Cisco proprietary ?)
 o capability length = 0x00

> > capability#1 code is 6
> > capability#1 length is 1
> > capability#1 value is 4
> >
> > capability#2 code is 0
> > capability#2 length is 1
> > capability#2 value is 0
> >
> > capability#3 code is 1
> > capability#3 length is 2
> > capability#3 value is 02 08
> >
> > capability#4 code is 0
> > ???
> > ???
> >
> > Per "draft-ietf-idr-bgp4-cap-neg-06.txt", I looked at RFC 2434 to
> > determine what these codes are. RFC 2434 refers to "the IETF consensus
> > process"
> >
> > thank you in advance

Patrick.
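[Editorial sketch] The interpretation above can be checked mechanically. The following is a minimal Python sketch, not taken from any router implementation: it assumes the bytes quoted in the question are the Optional Parameters Length octet followed by the optional parameters, and that each parameter and each capability is a one-octet type, one-octet length, value triple. The function names, the constant names, and the notes in CAPABILITY_NOTES are made up for illustration; the note on 0x80 only repeats what is reported in this thread.

```python
"""Sketch: decode the Optional Parameters field of a BGP OPEN message."""

CAPABILITIES_PARAM = 0x02  # optional parameter type 2 = Capabilities

# Illustrative labels only; 0x80 is in the Private Use range.
CAPABILITY_NOTES = {
    0x01: "Multiprotocol Extensions",
    0x02: "Route Refresh (value proposed in draft-ietf-idr-bgp-route-refresh-01)",
    0x80: "Private Use (reported in this thread as Cisco's early route refresh code)",
}


def _read_tlvs(buf: bytes):
    """Yield (type, value) pairs from a run of 1-octet type / 1-octet length TLVs."""
    pos = 0
    while pos < len(buf):
        if pos + 2 > len(buf):
            raise ValueError("truncated TLV header")
        tlv_type, tlv_len = buf[pos], buf[pos + 1]
        value = buf[pos + 2:pos + 2 + tlv_len]
        if len(value) != tlv_len:
            raise ValueError("truncated TLV value")
        yield tlv_type, value
        pos += 2 + tlv_len


def parse_optional_parameters(field: bytes):
    """Return [(param_type, [(capability_code, capability_value), ...]), ...].

    field[0] is the Optional Parameters Length; the parameters follow it.
    """
    if not field:
        raise ValueError("empty field")
    total = field[0]
    body = field[1:1 + total]
    if len(body) != total:
        raise ValueError("truncated optional parameters")
    parsed = []
    for param_type, param_value in _read_tlvs(body):
        # Only Capabilities parameters carry nested capability TLVs.
        caps = list(_read_tlvs(param_value)) if param_type == CAPABILITIES_PARAM else []
        parsed.append((param_type, caps))
    return parsed


def split_mp_capability(value: bytes):
    """Split a Multiprotocol Extensions value into (AFI, reserved, SAFI)."""
    if len(value) != 4:
        raise ValueError("multiprotocol capability value must be 4 octets")
    return int.from_bytes(value[0:2], "big"), value[2], value[3]


# The bytes quoted in the question, starting at the length octet.
OPEN_TAIL = bytes.fromhex("0C 02 06 01 04 00 01 00 01 02 02 80 00")

if __name__ == "__main__":
    for param_type, caps in parse_optional_parameters(OPEN_TAIL):
        for code, value in caps:
            note = CAPABILITY_NOTES.get(code, "unknown")
            print(f"param {param_type}: capability 0x{code:02X} ({note}), value {value.hex()}")
```

Run against the quoted bytes, the sketch finds two Capabilities parameters, one carrying capability 0x01 with a four-octet value and one carrying capability 0x80 with an empty value, which matches the breakdown given above.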
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id VAA16808 for ; Thu, 15 Jun 2000 21:47:23 -0400 (EDT) Received: by segue.merit.edu (Postfix) id B235B5DD94; Thu, 15 Jun 2000 21:45:16 -0400 (EDT) Delivered-To: idr-outgoing@merit.edu Received: by segue.merit.edu (Postfix, from userid 56) id 9758B5DDB6; Thu, 15 Jun 2000 21:45:16 -0400 (EDT) Received: from mailrelay00.aa.ops.us.uu.net (postal.aa.ops.us.uu.net [147.225.22.28]) by segue.merit.edu (Postfix) with ESMTP id 4BDCE5DD94 for ; Thu, 15 Jun 2000 21:45:14 -0400 (EDT) Received: from nsa-mail.us.newbridge.com ([209.58.11.226]) by mailrelay00.aa.ops.us.uu.net (8.9.3/8.9.3) with ESMTP id VAA15731 for ; Thu, 15 Jun 2000 21:45:10 -0400 (EDT) Received: (from smtpd@localhost) by nsa-mail.us.newbridge.com (8.9.3/8.9.2) id VAA27598 for ; Thu, 15 Jun 2000 21:37:28 -0400 (EDT) Received: from nsa-gw1.us.newbridge.com(209.58.11.225), claiming to be "herndon-mh1.us.newbridge.com" via SMTP by nsa-mail.us.newbridge.com, id smtpdAAAa006jC; Thu Jun 15 21:37:27 2000 Received: from okemo.northc.com by herndon-mh1.us.newbridge.com with ESMTP for bgp@ans.net; Thu, 15 Jun 2000 21:44:42 -0400 Received: by okemo.northc.com with Internet Mail Service (5.5.2448.0) id ; Thu, 15 Jun 2000 21:45:38 -0400 Message-Id: From: "Natale, John" To: "'bgp@ans.net'" Subject: RE: capabilities option Date: Thu, 15 Jun 2000 21:41:07 -0400 X-Mailer: Internet Mail Service (5.5.2448.0) Sender: owner-idr@merit.edu Precedence: bulk > What does this open message mean (from a Cisco)? 
> > [rest of open msg pkt...(BGP ID)]OC O2 06 01 04 00 01 00 01 02 02 80 > 00 > > I got as far as: > > 0C = option parameter length = 12 > 02 = option parameter type = capability > > 06 01 04 00 01 00 01 02 02 80 00 = option parameter value > > capability#1 code is 6 > capability#1 length is 1 > capability#1 value is 4 > > capability21 code is 0 > capability#2 length is 1 > capability#2 value is 0 > > capability#3 code is 1 > capability#3 length is 2 > capability#3 value is 02 08 > > capability#4 code is 0 > ??? > ??? > > Per "draft-ietf-idr-bgp4-cap-neg-06.txt", I looked at RFC 2434 to > determine what these codes are. RFC 2434 refers to the "the IETF consensus > process" > > thank you in advance Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id MAA24829 for ; Tue, 6 Jun 2000 12:50:01 -0400 (EDT) Received: by segue.merit.edu (Postfix) id F08F75DE37; Tue, 6 Jun 2000 12:47:51 -0400 (EDT) Delivered-To: idr-outgoing@merit.edu Received: by segue.merit.edu (Postfix, from userid 56) id DEE345DE36; Tue, 6 Jun 2000 12:47:51 -0400 (EDT) Received: from lava.cs.unh.edu (lava.cs.unh.edu [132.177.4.30]) by segue.merit.edu (Postfix) with ESMTP id AB9015DDF4 for ; Tue, 6 Jun 2000 12:47:49 -0400 (EDT) Received: from localhost (dmaftei@localhost) by lava.cs.unh.edu (8.9.0/8.9.0) with SMTP id MAA27889; Tue, 6 Jun 2000 12:46:16 -0400 (EDT) X-Authentication-Warning: lava.cs.unh.edu: dmaftei owned process doing -bs Date: Tue, 6 Jun 2000 12:46:16 -0400 (EDT) From: Danut C Maftei To: BJ Premore Cc: idr@merit.edu Subject: Re: inconsistent iBGP? In-Reply-To: <14653.5256.81008.988582@helvellyn.cs.dartmouth.edu> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-idr@merit.edu Precedence: bulk BJ, I don't think there's any contradiction. The case you're describing allows for load sharing from within your AS to a certain destination. 
As far as I know the spec does NOT require the use of a SINGLE exit point from an AS. If your border routers (your different NEXT_HOPs) are configured to assign the same degree of preference to an external destination, then I believe that the person in charge intended exactly this: to share the load between the two routers. BTW, I would recommend you use draft-ietf-idr-bgp4-10.txt for the BGP spec - RFC 1771 is somehow outdated. Let me know if this helps and/or if you have other questions. Regards, Dan C. Maftei dmaftei@cs.unh.edu On Tue, 6 Jun 2000, BJ Premore wrote: > I have a question about what appears to be a contradiction in RFC > 1771 with regard to providing a "consistent view of interior routing > within an AS". > > I'm working on a large-scale network simulation project taking place > in part at Dartmouth College (see ssfnet.org), and would like our > BGP-4 implementation to be "correct" according to the RFC. > > Anyway, I'm having trouble resolving what seems to be a contradiction > between section 3 and section 9.1.2.1. In section 3 (paragraph 3), it > states that a transit AS must present a consistent view of interior > routing. Section 9.1.2.1 explains tie-breaking for route selection. > If degree of preference and MED are the same, then step 'b' of the > algorithm states that lowest cost (interior distance) takes > precedence. But clearly, if there are two routes to the same > destination, each with the same DoP and MED but different NEXT_HOPs, > then based on interior distance to the NEXT_HOP, different BGP > speakers could select different routes to the same destination (since > one speaker might be closer to one NEXT_HOP than the other, and vice > versa). > > Maybe there's no contradiction because this is only allowed in a > non-transit AS, but I'd rather not have to guess about it. Could > anyone clarify and/or resolve this dilemma for me? 
> > Thanks, > > BJ > > > BJ Premore -- beej@cs.dartmouth.edu -- http://www.cs.dartmouth.edu/~beej/ > Grad Student, Computer Science Dept, Dartmouth College, Hanover, NH, USA > > > > > Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id LAA23301 for ; Tue, 6 Jun 2000 11:11:38 -0400 (EDT) Received: by segue.merit.edu (Postfix) id 89F965DDE5; Tue, 6 Jun 2000 11:11:10 -0400 (EDT) Delivered-To: idr-outgoing@merit.edu Received: by segue.merit.edu (Postfix, from userid 56) id 754EC5DDEA; Tue, 6 Jun 2000 11:11:10 -0400 (EDT) Received: from helvellyn.cs.dartmouth.edu (helvellyn-mr.cs.dartmouth.edu [129.170.192.42]) by segue.merit.edu (Postfix) with ESMTP id F41165DDE5 for ; Tue, 6 Jun 2000 11:11:08 -0400 (EDT) Received: (from beej@localhost) by helvellyn.cs.dartmouth.edu (8.9.3/8.9.3) id LAA31428; Tue, 6 Jun 2000 11:11:05 -0400 (EDT) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <14653.5256.81008.988582@helvellyn.cs.dartmouth.edu> Date: Tue, 6 Jun 2000 11:11:04 -0400 (EDT) To: idr@merit.edu Subject: inconsistent iBGP? X-Mailer: VM 6.72 under 21.1 (patch 3) "Acadia" XEmacs Lucid From: BJ Premore Organization: Dartmouth College Department of Computer Science X-Attribution: bj X-Face: 52,edr17C-@5L=5^e0lwqXIJ/BH9#Rb-NI&gsw#MIyE2QjmrSAW4^pTeYLL9[8<;!id!'^3 H]z(%G#\8xII%@1+VA!V+aE"sOX",A*s@-J'Oe*b@!H1"nU}<{YSne$rU+y{a,/?CB^2SjcIdE$ABw N"$ipB!qcL0P|vu Sender: owner-idr@merit.edu Precedence: bulk I have a question about what appears to be a contradiction in RFC 1771 with regard to providing a "consistent view of interior routing within an AS". I'm working on a large-scale network simulation project taking place in part at Dartmouth College (see ssfnet.org), and would like our BGP-4 implementation to be "correct" according to the RFC. Anyway, I'm having trouble resolving what seems to be a contradiction between section 3 and section 9.1.2.1. 
In section 3 (paragraph 3), it states that a transit AS must present a consistent view of interior routing. Section 9.1.2.1 explains tie-breaking for route selection. If degree of preference and MED are the same, then step 'b' of the algorithm states that lowest cost (interior distance) takes precedence. But clearly, if there are two routes to the same destination, each with the same DoP and MED but different NEXT_HOPs, then based on interior distance to the NEXT_HOP, different BGP speakers could select different routes to the same destination (since one speaker might be closer to one NEXT_HOP than the other, and vice versa). Maybe there's no contradiction because this is only allowed in a non-transit AS, but I'd rather not have to guess about it. Could anyone clarify and/or resolve this dilemma for me? Thanks, BJ BJ Premore -- beej@cs.dartmouth.edu -- http://www.cs.dartmouth.edu/~beej/ Grad Student, Computer Science Dept, Dartmouth College, Hanover, NH, USA Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id AAA20194 for ; Fri, 2 Jun 2000 00:32:08 -0400 (EDT) Received: by segue.merit.edu (Postfix) id 079425DDA1; Fri, 2 Jun 2000 00:31:15 -0400 (EDT) Delivered-To: idr-outgoing@merit.edu Received: by segue.merit.edu (Postfix, from userid 56) id E5C835DE0D; Fri, 2 Jun 2000 00:31:14 -0400 (EDT) Received: from mailrelay00.aa.ops.us.uu.net (postal.aa.ops.us.uu.net [147.225.22.28]) by segue.merit.edu (Postfix) with ESMTP id 81A9F5DDA1 for ; Fri, 2 Jun 2000 00:31:13 -0400 (EDT) Received: from daewoo.dti.daewoo.co.kr ([165.133.13.60]) by mailrelay00.aa.ops.us.uu.net (8.9.3/8.9.3) with ESMTP id AAA07899 for ; Fri, 2 Jun 2000 00:30:52 -0400 (EDT) Received: from param1 (seth [165.133.13.20]) by daewoo.dti.daewoo.co.kr (8.8.8+Sun/8.8.8) with SMTP id JAA24404 for ; Fri, 2 Jun 2000 09:55:00 -0600 (GMT) Message-ID: <005101bfcc4b$608adcf0$140d85a5@dti.daewoo.co.kr> From: "Naveen Seth" To: Subject: Internet - AS Date: Fri, 2 
Jun 2000 10:00:58 +0530 MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_NextPart_000_004E_01BFCC79.73336380" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.00.2314.1300 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 Sender: owner-idr@merit.edu Precedence: bulk This is a multi-part message in MIME format. ------=_NextPart_000_004E_01BFCC79.73336380 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Hi, In the contemporary internet we have NAPs (Network Access Point), = POPs(Point of Presence). 1. How do these map to the Autonomous Systems concept ? 2. In general, where will BGP be running and where will the various IGPs = be running? Thanks Naveen ------=_NextPart_000_004E_01BFCC79.73336380 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
------=_NextPart_000_004E_01BFCC79.73336380-- Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id OAA04722 for ; Thu, 1 Jun 2000 14:11:19 -0400 (EDT) Received: by segue.merit.edu (Postfix) id D25A25DDC8; Thu, 1 Jun 2000 14:10:52 -0400 (EDT) Delivered-To: idr-outgoing@merit.edu Received: by segue.merit.edu (Postfix, from userid 56) id B59B35DDF0; Thu, 1 Jun 2000 14:10:52 -0400 (EDT) Received: from mailrelay00.aa.ops.us.uu.net (postal.aa.ops.us.uu.net [147.225.22.28]) by segue.merit.edu (Postfix) with ESMTP id 0AA715DDC8 for ; Thu, 1 Jun 2000 14:10:51 -0400 (EDT) Received: from icarian.ZAFFIRE.COM (kahu.new-access.com [216.217.10.2] (may be forged)) by mailrelay00.aa.ops.us.uu.net (8.9.3/8.9.3) with ESMTP id OAA13675 for ; Thu, 1 Jun 2000 14:10:45 -0400 (EDT) Received: by ICARIAN with Internet Mail Service (5.5.2650.21) id ; Thu, 1 Jun 2000 11:15:58 -0700 Message-ID: <9A564CC874B5D3118FB9009027B0A6622D7D60@ICARIAN> From: Eric Gray To: "'Rijsman, Bruno'" , "'BGP exploder'" , "'IDRP exploder'" Cc: "'Eric Rosen'" , Eric Gray Subject: RE: Value for "label" in the NEXT-HOP field for BGP/MPLS VPNs Date: Thu, 1 Jun 2000 11:15:57 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2650.21) Content-Type: text/plain; charset="iso-8859-1" Sender: owner-idr@merit.edu Precedence: bulk Bruno, I don't see why this is necessary. Also, I wonder if this should be discussed on the MPLS mailing list - either as well as, or in lieu of the BGP/IDRP mailing lists. Note that I did not add the list, but it's okay with me if you do. The next hop must be from the same address family, which is the AFI part of your comment. There is no requirement - that I can see - for it to be in the same SAFI. More-over, I know of no way to indicate to BGP that there is a SAFI associated with an address in the next hop field of a BGP update message. And the AFI would not be 1. 
-- Eric Gray > -----Original Message----- > From: Rijsman, Bruno [mailto:BRijsman@unispheresolutions.com] > Sent: Thursday, June 01, 2000 10:06 AM > To: 'BGP exploder'; 'IDRP exploder' > Subject: Value for "label" in the NEXT-HOP field for BGP/MPLS VPNs > > > I have a question about the following paragraph in > draft-rosen-rfc2547bis-01: > > When a PE router distributes a VPN-IPv4 route via BGP, it uses its > own address as the "BGP next hop". This address is encoded as a > VPN-IPv4 address with an RD of 0. ([BGP-MP] requires that the next > hop address be in the same address family as the NLRI.) It also > assigns and distributes an MPLS label. (Essentially, PE routers > distribute not VPN-IPv4 routes, but Labeled VPN-IPv4 routes. Cf. > [MPLS-BGP]). When the PE processes a received packet that has this > label at the top of the stack, the PE will pop the stack, > and process > the packet appropriately. > > Address family indicates that each NLRI in an > MP-REACH-NLRI or MP-UNREACH-NLRI attribute consists of three parts: > a) An MPLS label stack > b) A route distinguisher > c) An IPv4 prefix > > Since [BGP-MP] requires that the NEXT-HOP field in an MP-REACH-NLRI be > encoding in the same address family as the NLRI, I suppose it > follows that > the NEXT-HOP must also consist of: > a) An MPLS label stack > b) A route distinguisher > c) An IPv4 prefix > > draft-rosen-rfc2547bis-01 says we should use value 0 for the > RD (i.e. use > 0x0000000000000000, which is the encoded form of type=0, AS=0, > assigned_nr=0). > > However, draft-rosen-rfc2547bis-01 doesn't mention the value > we should use > for the MPLS label stack. If I understand the draft > correctly, labels for > PE-to-PE LSPs are assigned using some MPLS signalling > protocol (e.g. LDP or > RSVP) and only have local significance for a given core link. 
> Therefore, it > makes no sense to put a label value in the NEXT-HOP field and > ideally we > should use some well-defined value which means "no label > stack" here. A > problem here is that using the rules specified in [MPLS-BGP] it is not > possible to encode an "empty" label stack. A label stack > always contains at > least one label (and hence at least 3 octets). So what octets > should we use? > I suggest we use 0x800000 to be consistent with the label value in the > MP_UNREACH_NLRI attribute as specified in [MPLS-BGP]. > > In summary, I suggest we change that paragraph to read: > > When a PE router distributes a labeled VPN-IPv4 route via > BGP, it uses > its > own address as the "BGP next hop". This address is > encoded as a labeled > VPN-IPv4 address with a label stack encoded as 0x800000 > and an RD encoded > as > 0x0000000000000000. ... > > > -- Bruno Rijsman > > > > > > Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id NAA01922 for ; Thu, 1 Jun 2000 13:03:45 -0400 (EDT) Received: by segue.merit.edu (Postfix) id BDC255DE00; Thu, 1 Jun 2000 13:03:17 -0400 (EDT) Delivered-To: idr-outgoing@merit.edu Received: by segue.merit.edu (Postfix, from userid 56) id AE2625DDF0; Thu, 1 Jun 2000 13:03:17 -0400 (EDT) Received: from mailrelay00.aa.ops.us.uu.net (postal.aa.ops.us.uu.net [147.225.22.28]) by segue.merit.edu (Postfix) with ESMTP id 2DF215DDE8 for ; Thu, 1 Jun 2000 13:03:16 -0400 (EDT) Received: from uniwest1.redstonecom.com ([199.105.223.130]) by mailrelay00.aa.ops.us.uu.net (8.9.3/8.9.3) with ESMTP id NAA09886 for ; Thu, 1 Jun 2000 13:03:11 -0400 (EDT) Received: by uniwest1.redstonecom.com with Internet Mail Service (5.5.2650.21) id ; Thu, 1 Jun 2000 13:06:04 -0400 Message-ID: <49FF5C6DDBD8D311BBBD009027DE980C0737F6@uniwest1.redstonecom.com> From: "Rijsman, Bruno" To: "'BGP exploder'" , "'IDRP exploder'" Subject: Value for "label" in the NEXT-HOP field for BGP/MPLS VPNs Date: Thu, 1 Jun 2000 
13:05:55 -0400 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2650.21) Content-Type: text/plain; charset="iso-8859-1" Sender: owner-idr@merit.edu Precedence: bulk I have a question about the following paragraph in draft-rosen-rfc2547bis-01: When a PE router distributes a VPN-IPv4 route via BGP, it uses its own address as the "BGP next hop". This address is encoded as a VPN-IPv4 address with an RD of 0. ([BGP-MP] requires that the next hop address be in the same address family as the NLRI.) It also assigns and distributes an MPLS label. (Essentially, PE routers distribute not VPN-IPv4 routes, but Labeled VPN-IPv4 routes. Cf. [MPLS-BGP]). When the PE processes a received packet that has this label at the top of the stack, the PE will pop the stack, and process the packet appropriately. Address family indicates that each NLRI in an MP-REACH-NLRI or MP-UNREACH-NLRI attribute consists of three parts: a) An MPLS label stack b) A route distinguisher c) An IPv4 prefix Since [BGP-MP] requires that the NEXT-HOP field in an MP-REACH-NLRI be encoding in the same address family as the NLRI, I suppose it follows that the NEXT-HOP must also consist of: a) An MPLS label stack b) A route distinguisher c) An IPv4 prefix draft-rosen-rfc2547bis-01 says we should use value 0 for the RD (i.e. use 0x0000000000000000, which is the encoded form of type=0, AS=0, assigned_nr=0). However, draft-rosen-rfc2547bis-01 doesn't mention the value we should use for the MPLS label stack. If I understand the draft correctly, labels for PE-to-PE LSPs are assigned using some MPLS signalling protocol (e.g. LDP or RSVP) and only have local significance for a given core link. Therefore, it makes no sense to put a label value in the NEXT-HOP field and ideally we should use some well-defined value which means "no label stack" here. A problem here is that using the rules specified in [MPLS-BGP] it is not possible to encode an "empty" label stack. 
A label stack always contains at least one label (and hence at least 3 octets). So what octets should we use? I suggest we use 0x800000 to be consistent with the label value in the MP_UNREACH_NLRI attribute as specified in [MPLS-BGP]. In summary, I suggest we change that paragraph to read: When a PE router distributes a labeled VPN-IPv4 route via BGP, it uses its own address as the "BGP next hop". This address is encoded as a labeled VPN-IPv4 address with a label stack encoded as 0x800000 and an RD encoded as 0x0000000000000000. ... -- Bruno Rijsman Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id KAA28924 for ; Thu, 1 Jun 2000 10:03:48 -0400 (EDT) Received: by segue.merit.edu (Postfix) id AD6AE5DE03; Thu, 1 Jun 2000 10:01:50 -0400 (EDT) Delivered-To: idr-outgoing@merit.edu Received: by segue.merit.edu (Postfix, from userid 56) id 9D9D25DDFD; Thu, 1 Jun 2000 10:01:50 -0400 (EDT) Received: from mailrelay00.aa.ops.us.uu.net (postal.aa.ops.us.uu.net [147.225.22.28]) by segue.merit.edu (Postfix) with ESMTP id 39A325DDF6 for ; Thu, 1 Jun 2000 10:01:48 -0400 (EDT) Received: from uniwest1.redstonecom.com ([199.105.223.130]) by mailrelay00.aa.ops.us.uu.net (8.9.3/8.9.3) with ESMTP id KAA29812 for ; Thu, 1 Jun 2000 10:01:47 -0400 (EDT) Received: by uniwest1.redstonecom.com with Internet Mail Service (5.5.2650.21) id ; Thu, 1 Jun 2000 10:04:39 -0400 Message-ID: <49FF5C6DDBD8D311BBBD009027DE980C0737F5@uniwest1.redstonecom.com> From: "Rijsman, Bruno" To: "'BGP exploder'" , "'IDRP exploder'" Subject: Multiprotocol BGP suggestion Date: Thu, 1 Jun 2000 10:04:32 -0400 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2650.21) Content-Type: text/plain; charset="iso-8859-1" Sender: owner-idr@merit.edu Precedence: bulk At least 4 SAFI values are currently used with Multiprotocol BGP: SAFI 1 : unicast prefix SAFI 2 : multicast prefix SAFI 3 : unicast & multicast prefix SAFI 4 : labeled prefix SAFI 128 : labeled VPN-IPv4 
prefix

By looking at this list we see that a prefix may have several attributes:

  a) It may be unicast or multicast or both
  b) It may be labeled or not
  c) It may be a public prefix (i.e. without RD) or a VPN (private)
     prefix (i.e. with RD)

In fact these attributes of a prefix are orthogonal, meaning that any
combination of these attributes makes sense. However, because of the way
SAFI values are allocated, only certain combinations are permitted.

For example, we can construct a unicast labeled VPN prefix (namely SAFI 128)
but we cannot create a multicast labeled VPN prefix, nor a unicast unlabeled
VPN prefix, nor a multicast labeled public prefix.

This is not just a theoretical exercise. It has practical implications: for
example, it is currently not possible to carry multicast prefixes across the
core in a BGP/MPLS VPN.

The issue could easily be avoided by a different encoding of the SAFI value.
For example:

  Bit 0   : is this a unicast prefix (1 = yes, 0 = no).
  Bit 1   : is this a multicast prefix (1 = yes, 0 = no).
  Bit 2   : is this a labeled prefix (1 = yes, 0 = no). If yes, a label
            stack is prepended to the prefix.
  Bit 3   : is this a private prefix (1 = yes, 0 = no). If yes, an RD is
            prepended to the prefix.
  Bits 4-7: reserved (send 0, ignore on receipt)

If both a label stack and an RD are present, the label stack should be
encoded before the RD.

Another side-effect of this is that we can create labeled VPN XYZ prefixes,
where XYZ is any address family (IPv6, CLNS, IPX, ...) as identified by the
AFI. This would allow us to create not only IPv4 VPNs but also IPv6 VPNs,
CLNS VPNs, etc.

Does anyone see any merit in doing this?

-- Bruno Rijsman
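[Editorial sketch] The bit-per-attribute octet proposed in the message above could be modelled as follows. This is a minimal Python sketch under stated assumptions: the octet is taken to replace the one-octet SAFI, and "Bit 0" is taken to be the least significant bit, which the message does not specify. The names PrefixFlags, encode_flags, decode_flags and nlri_layout are hypothetical and do not come from any specification.

```python
from enum import IntFlag


class PrefixFlags(IntFlag):
    """Attribute bits from the proposal (assumption: bit 0 is the LSB)."""
    UNICAST = 1 << 0
    MULTICAST = 1 << 1
    LABELED = 1 << 2    # a label stack is prepended to the prefix
    PRIVATE = 1 << 3    # an RD is prepended to the prefix


DEFINED_BITS = 0x0F     # bits 4-7 are reserved


def encode_flags(flags: PrefixFlags) -> int:
    """Build the octet to send; reserved bits are always sent as 0."""
    return int(flags) & DEFINED_BITS


def decode_flags(octet: int) -> PrefixFlags:
    """Read a received octet; reserved bits are ignored on receipt."""
    return PrefixFlags(octet & DEFINED_BITS)


def nlri_layout(flags: PrefixFlags) -> list:
    """Order of the fields in one NLRI: label stack, then RD, then prefix."""
    parts = []
    if PrefixFlags.LABELED in flags:
        parts.append("label stack")
    if PrefixFlags.PRIVATE in flags:
        parts.append("RD")
    parts.append("prefix")
    return parts


if __name__ == "__main__":
    # A multicast labeled VPN prefix, one of the combinations the message
    # says cannot be expressed with the currently allocated SAFI values.
    wanted = PrefixFlags.MULTICAST | PrefixFlags.LABELED | PrefixFlags.PRIVATE
    print(f"octet = 0x{encode_flags(wanted):02X}, layout = {nlri_layout(wanted)}")
```

Because each attribute has its own bit, every combination of the four attributes is expressible, which is the orthogonality the message argues for.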