As requested, I have reviewed this document as part of the Operational Directorate's ongoing effort to review all IETF documents being processed by the IESG. These comments were written with the intent of improving the operational aspects of the IETF drafts. Comments that are not addressed in last call may be included in AD reviews during the IESG review. Document editors and WG chairs should treat these comments just like any other last call comments.

Although I have concerns, which I state below, I consider the draft to be ready for IESG review and potential publication as an RFC at Proposed Standard. I have no specific issues I would like to see addressed, nor do I believe the technology or the draft to be fundamentally flawed.

Speaking in general terms, this draft describes a solution for the problem posed in RFC 5714, which is to say a solution for fast reroute in a network whose routing is implemented using IS-IS and LDP. It is not the only possible solution.

In terms of graph theory, we might define a "connected graph" as a set of "nodes" and a set of "links" that interconnect them, such that every node is connected via some sequence of links and nodes to each of the other nodes in the connected graph. The Maximally Redundant Tree model seeks to divide the connected graph into two or more connected sub-graphs, each of which connects the same set of nodes, but using sets of interconnecting links whose intersection is empty, or is at least minimized. In the event that a link in one connected sub-graph fails, the network can continue to use another connected sub-graph to guide routing during the outage. There are obvious degenerate cases, in which the sets of links in the sub-graphs are forced to overlap to some degree, or some nodes are not found in all sub-graphs.
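
To make the graph-theoretic description above concrete, here is a minimal Python sketch. It is not the MRT algorithm from the draft, which is considerably more involved; it only checks the two properties described above for a pair of candidate sub-graphs: that each one connects every node, and that their link sets do not intersect. The node names, the link lists, and the function names `is_connected` and `shared_links` are all hypothetical, and links are treated as undirected for simplicity.

```python
from collections import deque


def is_connected(nodes, links):
    """Return True if every node can reach every other node over `links`."""
    nodes = set(nodes)
    if not nodes:
        return True
    # Build an undirected adjacency map.
    adjacency = {node: set() for node in nodes}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    # Breadth-first search from an arbitrary starting node.
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current] - seen:
            seen.add(neighbor)
            queue.append(neighbor)
    return seen == nodes


def shared_links(links_a, links_b):
    """Return the links used by both sub-graphs, ignoring link direction."""
    set_a = {frozenset(link) for link in links_a}
    set_b = {frozenset(link) for link in links_b}
    return set_a & set_b


# Hypothetical fully meshed four-node network, split into two sub-graphs.
nodes = ["A", "B", "C", "D"]
blue = [("A", "B"), ("B", "C"), ("C", "D")]
red = [("B", "D"), ("D", "A"), ("A", "C")]

both_span = is_connected(nodes, blue) and is_connected(nodes, red)
overlap = shared_links(blue, red)

# Degenerate case: node E hangs off A by a single link, so any two
# sub-graphs that both reach E are forced to share that link.
nodes_with_stub = nodes + ["E"]
blue_with_stub = blue + [("A", "E")]
red_with_stub = red + [("E", "A")]
forced_overlap = shared_links(blue_with_stub, red_with_stub)
```

In the first example both sub-graphs span all four nodes and `overlap` is empty, so the failure of any single link leaves at least one sub-graph intact. In the second, `forced_overlap` contains the A-E link, which illustrates the kind of degenerate case mentioned above.
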
Part of the architecture is designed to identify those cases (which might occur, for example, in the presence of multiple simultaneous failures, or when the network is inherently deficient for reasons unrelated to, and perhaps in violation of, the mathematics) and handle them as best it can. As one might imagine, this is not trivial.

My first comment on reading the architecture (and on reading the algorithm, which is a separate document) is that the algorithm is complex, and therefore (like anything that is complex) prone to errors and failures of various kinds, and potentially has failure modes that have not yet been detected. This is not a strike against it, but a point of caution: an operator using the approach will want to ensure that they have the tools necessary to monitor network health, and to quickly discover and correct errors if and when they occur. The algorithm draft contains several proofs of correctness for various parts or in various cases, and refers to papers containing such proofs, with the intent of minimizing the inherent risk. That said, to my knowledge there is no global proof of correctness, as there is, for example, for the Shortest Path First algorithm or other algorithms used in the network. The risk is therefore not zero. From the perspective of the IETF, that is precisely the reason a protocol like this should be used operationally at the Proposed Standard level, updated as needed, and ultimately re-released as an Internet Standard when the algorithm and implementations have been operationally proven.

With that introduction, the first question in my mind is whether the description is such that two implementors are likely to be able to produce interoperable implementations, or whether ambiguities or lack of clarity would prevent that. This draft identifies two proprietary prototype implementations, by Huawei and Juniper, which, if they are interoperable, would address the question to a considerable degree.
The draft does not, however, describe interoperability testing between them, which at least suggests that such testing has yet to take place. On this score, given the complexity of the design, I personally would be greatly comforted by a test report along the lines of RFC 1246. Since such tests usually find text that needs tweaking, I might suggest that publication as an RFC be delayed until such testing can be performed and the lessons learned, whatever they are, incorporated in the documents. Failing that, experience leads me to believe that there will be subsequent documents that update or obsolete these.

The corollary question in my mind is whether an operator reading the architecture will be able to figure out how to use it effectively. On this score, I give the draft a thumbs-up. It is well written, the various issues are raised and dealt with, and the ramifications are in my view clear.