Cullen Jennings
2017-05-15 16:59:47 UTC
Overall, the actual protocol described by the spec seems fine, apart from a few very small technical details. However, it is a bit challenging to understand the basic overview of how it works. I have a few suggestions that I think would not be much work and would really help people get up to speed, mainly by moving some of the later example text closer to where the overview section is now.
Technical Issues -----------
Issue 1: Pause in an SDP re-offer
The pause in section 6.3.4 seems problematic. If A is sending a re-offer to B, there is no real way to know the pause state of a stream B is sending to A at that point without glare issues. Saying A has to send something that matches a state on B is not really possible to implement.
I think this should be changed to say that A sets the pause flags in the SDP for the RIDs A is sending to match what A currently thinks their state is, and A sets the pause flags for the RIDs A is receiving to what A wishes them to be. The way that B processes the offer and creates the answer, as well as the handling of the answer by A, would be the same as for the initial offer/answer.
Given the timing issues of RTCP near the start of the call, if we don't make this change, it may take significant time to un-pause a particular stream at the start of the call. The above change avoids that delay, so that if a user joined a call that had, say, a paused presentation stream and wanted to un-pause it right away, they could.
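A minimal sketch of the proposed re-offer rule, assuming the "~" prefix is what marks a paused RID in the a=simulcast line and ignoring alternatives (comma-separated RIDs). The function name and the two dictionaries are hypothetical and are not taken from the draft.

```python
def simulcast_line(send_state, recv_wish):
    """Build an a=simulcast line for a re-offer sent by A.

    send_state: RID -> True if A currently believes that stream (sent by A) is paused.
    recv_wish:  RID -> True if A would like that stream (sent by B) to be paused.
    Both are hypothetical inputs; "~" as the pause marker is an assumption.
    """
    def fmt(rids):
        # Prefix paused RIDs with "~" and separate simulcast streams with ";".
        return ";".join(("~" if paused else "") + rid for rid, paused in rids.items())

    parts = []
    if send_state:
        parts.append("send " + fmt(send_state))
    if recv_wish:
        parts.append("recv " + fmt(recv_wish))
    return "a=simulcast:" + " ".join(parts)


# A is sending RIDs 1 and 2 (2 is paused) and wants RID 3 from B un-paused.
print(simulcast_line({"1": False, "2": True}, {"3": False}))
```

The point of the sketch is that both inputs are local to A: nothing in the re-offer requires A to know B's current pause state.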
Issue 2: More than one simulcast line
Some SDP stacks do not preserve the order of a= lines. I have no idea if this is a bug in them or not. I think it would be better to phrase this spec as: each m= section MUST have at most one a=simulcast line. The current phrasing, where the receiver ignores all but the first, just begs for people to put proprietary stuff in a second simulcast line.
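A rough sketch of how a receiver could enforce the "at most one a=simulcast line per m= section" phrasing without relying on attribute order. The function name is hypothetical, and rejecting the SDP with an exception is just one possible way to handle a violation.

```python
def check_single_simulcast(sdp):
    """Return the number of m= sections in the SDP text.

    Raises ValueError if any m= section carries more than one a=simulcast line.
    """
    sections = 0
    count = 0
    in_media = False
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # A new media section starts; reset the per-section counter.
            sections += 1
            count = 0
            in_media = True
        elif in_media and line.startswith("a=simulcast:"):
            count += 1
            if count > 1:
                raise ValueError(
                    f"m= section {sections} has more than one a=simulcast line"
                )
    return sections
```

Because the check only counts lines within a section, it gives the same result whether or not the stack preserved the original order of the a= lines.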
Issue 3: Relating simulcast streams using PT
When the PTs are unique, they can be used instead of the RID. Obviously, if they are not unique they can't be used, but when they are, using them often means the video can be displayed sooner. Often the RID/MID will not be included in every RTP packet because of the bandwidth usage, and are instead sent periodically. Being able to join a conference and start displaying stuff right away is nice when possible.
I think section 6.5 needs to be updated to be clear that "Simulcast streams MUST be related on RTP level through RID." still means it is fine for the RTP receiver to look at the PT of the RTP packet and, if that PT uniquely maps to a RID in the SDP, use that for the relation.
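A minimal sketch of the PT-based relation, assuming the a=rid lines carry their payload types in a "pt=" restriction and that the caller passes only the a=rid lines for streams it receives in one m= section. The function name is hypothetical. A PT is mapped only when exactly one RID lists it, and the sketch conservatively returns no mapping at all if any RID has no pt= restriction, since such a RID could use any PT.

```python
def unique_pt_to_rid(rid_lines):
    """Map payload type -> RID for every PT listed by exactly one a=rid line.

    Each line is expected to look like "a=rid:1 send pt=97;max-width=1280".
    """
    owners = {}          # PT -> set of RIDs whose pt= restriction lists it
    unrestricted = False  # True if some RID has no pt= restriction
    for line in rid_lines:
        fields = line.strip()[len("a=rid:"):].split(" ", 2)
        rid = fields[0]
        params = fields[2] if len(fields) > 2 else ""
        has_pt = False
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key == "pt":
                has_pt = True
                for pt in value.split(","):
                    owners.setdefault(int(pt), set()).add(rid)
        if not has_pt:
            unrestricted = True
    if unrestricted:
        return {}
    return {pt: next(iter(rids)) for pt, rids in owners.items() if len(rids) == 1}


lines = ["a=rid:1 send pt=97;max-width=1280", "a=rid:2 send pt=98", "a=rid:3 send pt=99,98"]
print(unique_pt_to_rid(lines))  # PT 98 is shared by RIDs 2 and 3, so it is left out
```

A receiver using such a map would fall back to the RID header extension for any packet whose PT is not in it.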
Editorial ----------------
The use cases and requirements go on for a while, and the first thing that starts to explain how this works is bullet point 4 in the overview section, which says:
o The codec configuration for a simulcast stream is expressed
through use of separately specified RTP payload format
restrictions [I-D.ietf-mmusic-rid] with an associated RTP-level
identification mechanism [I-D.ietf-avtext-rid] to identify which
RTP payload format restrictions an RTP stream adheres to. This
complements and effectively extends simulcast stream
identification and configuration possibilities that could be
provided by using only SDP formats as identifier. Use of multiple
RTP streams with the same (non-redundancy) media type in the
context of a single media source, where those RTP streams are
using different RtpStreamId, is a strong but not totally
unambiguous indication of those RTP streams being part of a
simulcast.
Have a skim of the draft up to that point and ask yourself how much sense it would make by the time you get to this point, or how much you would understand about how the mechanisms of this draft actually work before you get to the ABNF shortly after this. I suspect you will come to the conclusion that it's a bit hard to understand the big picture of how it works.
I really can't make heads or tails of the paragraph I quoted above, but the draft would make more sense to me if we removed it, along with the rest of section 5. From the bullet point above, the draft goes straight into the ABNF for the new stuff. I think it would be easier to understand if we moved section 4 to an appendix and greatly shortened section 3, then added an overview that starts by showing the offer and answer from 6.6.1 and explaining that the only thing this draft adds is the a=simulcast line. Explain the concept that one m= line means one source. Then show how the simulcast line allows sending the 1;2 RIDs. Then explain how alternatives work. From there, jump into the details. I don't think this would be much new text, and it would not change how anything works; it would just clear up the explanation.
Sections 6.1 and 6.2 are confusing because they are written as if they are not for offer/answer SDP. I think it would be better to state up front that this is for offer/answer SDP only, and to write these sections so that it is clear whether they are referring to offers or answers when talking about SDP. As a specific example, the pause discussion on page 13 might be correct for answers but looks less correct for offers.
The SCID is really confusing in the draft. The draft is never fully clear about whether this is a RID or not. It calls them "identical to", but it is hard to tell whether that means they are different things with the same value, or work the same way, or something else. I think we should remove the term SCID from the draft and just use RID. Similarly, using RtpStreamId is confusing. I think we should just refer to that as the header that carries the RID.
The example at the top of page 11 makes no sense without enough of the SDP to see the RIDs, m= lines, etc. It also needs to be broken apart into an offer example followed by the answer to that offer.
NIT - define SFM on first use