SECDIR review of draft-ietf-rtcweb-security-11
Reviewer: Nancy Cam-Winget
Review result: Ready with Nits

I have reviewed this document as part of the security directorate's ongoing effort to review all IETF documents being processed by the IESG. These comments were written primarily for the benefit of the security area directors. Document editors and WG chairs should treat these comments just like any other last call comments.

This document defines the threat model and security analysis of WebRTC. There are a few normative clauses to suggest requirements; I think the document could benefit from having more normative requirements to make it a stronger standards track document. Otherwise, it reads more as an informational document.

The following are general comments and suggestions (and editorial nits at the end):

Section 3. Should it also be noted that, as the browser has no purview over the actual application running, attacks from the application layer can still occur but are not in scope for WebRTC?

Section 4.1
- The conceptual model is a bit confusing, as I think Entity can refer to both the WebRTC server and the receiving client application? The notion is that the User is trusting the WebRTC (client) application (Entity A) to send the media to another user, but really to the receiving application (Entity B). Perhaps model 1 has a typo where the "talk to" should be Entity B?
- The paragraph following the conceptual models is a bit awkward. If I understand correctly, the intent is to state that the user believes Entity A when attempting to call Entity B, but Entity A (the client app) could in fact also send the media to Entity C? Which is valid, but the writing is awkward as the A's, B's and C's are not explicitly stated as such, or at least not from the end user's viewpoint.
- "In either case, all the browser is able to do is verify and check authorization for whoever is controlling where the media goes". Is "whoever" meant to be the WebRTC application controlling media paths? That should be clarified.
- "...consent to access local devices is largely orthogonal to consent to transmit..." It should be both consent to local devices and the device's resources, unless you mean "local devices" as in camera and microphone; but I think processes and stored or cached files (e.g. other resources) need to be considered too.
- "consent to send network traffic is about preventing the user's browser from being used to attack its local network". This is true, though I think you are implying that the network traffic is enforced to be encrypted, which I'm not sure is always the case; so I would think it is up to the browser to ensure that this is the case unless the browser's policy has made an exception.

Section 4.1.2.1
"...bug my computer" should be clarified. I think there are at least two dimensions to the "bug": it has "free" access to my resources, and it can potentially "listen" to my calls; it could also potentially override the use of my resources.
The last sentence of the paragraph, "Note that question of consent...": the note makes sense, but I am not sure what the last clause, "...the site is not listening in", means; can you clarify this?

Section 4.1.2.2
"...the need for a second consent": perhaps it is the need for a "distinct" consent, as there may not be a previous consent? By "distinct" I mean that it is a different type of consent than what may have been granted to the car manufacturer (in your example) for getting my geolocation... or perhaps I missed the "first" consent type.
The last sentence of the paragraph is difficult to parse. I think it is asserting a requirement that the GUI used to launch the call must show the call status (active, continuing, stopped)?
This section alludes to granting "call" access only for the duration of the call, and to limiting that access ("just because I want some information on a car doesn't mean..."); the attack vector should be better highlighted. I also believe that the browser can only grant the full client application access, so for the duration of the call, it can very well be that the app can get access to my resources beyond just the call (audio only vs. audio + video...).

Section 4.1.3
- The countermeasures can also be combined (e.g. consent is only given to a given user for each call, as well as having the appropriate keying material). This is subtly alluded to in the second-to-last paragraph, but the text doesn't consider that it can be done for all calls.

Section 4.1.4
- It should also be noted that weaknesses in the HTTPS stack (weak authentication or key establishment) can also be exploited by an attacker... and perhaps it should be a MUST to enforce strong mutual authentication and key management.

Section 4.2.3
- As I'm unfamiliar with ICE/STUN, I'm not sure what checks are referred to in "...unsafe to completely remove the requirement for some check"; this should be clarified. I am not sure whether the succeeding sentence is a forward reference to proposed checks or whether they are listed elsewhere.

Section 4.3.1: since the draft is listing requirements, in "...exchange mechanism imperative for...", should the "imperative" be a normative "MUST"?

Section 4.3.2: who is the "remote endpoint"?

Editorial nits:

Section 1. It may be useful to describe why it is "immediately apparent" that new security challenges result... suggest removing "immediately".

Section 4.1.1 (last sentence in the second-to-last paragraph) is missing a "to": "sophisticated attack would be open up a".

Section 4.1.3: "Now that we have seen another use case" seems odd or superfluous.
Suggest just removing that clause, or using "With the aforementioned use cases, we can start...".

Section 4.2.1: as it is a first reference, the title should spell out "Interactive Connectivity Establishment (ICE)"; the same goes for STUN, and STUN's reference (RFC 5389) should be called out.

Section 4.2.2: SRTP, as a first reference, should have its full reference to match the acronym, and TURN TCP should have a reference too.

Section 4.2.3: RTCP needs a reference?

Section 4.3: is it "a problem from the SIP world" or "a problem familiar to the SIP world"?
- Extra "is" in "calling service is is non-malicious".

Section 4.3.1: "in" should be removed from "...if end-to-end keying is in used".

Section 4.3.2.1: "One natural approach is to use ...": natural approach to what? I think it is to mitigate the during-call attack only?