Clarify whether "short" representations are permitted #178
Comments
From my perspective of content processing and solution development, I support the view that no gaps should exist and that padding segments should always be used. However, I do not believe IOP requires this. There is an ongoing discussion in the live task force right now with regard to period cutting and it was mentioned that DASH amd 3 added a
I would add a related question to you: is there any difficulty from a player developer's perspective when a period ends in the middle of a segment? Can the partial segment be cut without issues?
The main scenario I describe is for on-demand DASH, where we can easily work out how long each representation is and therefore whether it's shorter than the period without additional signalling. It's not that this case is hugely impractical for a player to support. It's that it's yet another special case that just needn't exist. If a representation is allowed to end early then the player can't make nice assumptions like "there will be a segment I can request for any valid seek position within the period", so you end up with extra code paths through player implementations, which need extra tests etc. To summarize: Allowing short representations appears to add complexity pretty much everywhere for very little actual benefit. Do you think this is something IOP could address, either by explicitly justifying why allowing short representations adds significant benefit, or by recommending that packagers do not do this?
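As a rough illustration of the on-demand check described above, here is a minimal Python sketch that flags representations ending before the period does. The function name, the `rep_durations` mapping and the `tolerance` parameter are hypothetical and do not come from DASH or IOP; durations are assumed to be in seconds.

```python
from typing import Dict, List


def find_short_representations(
    period_duration: float,
    rep_durations: Dict[str, float],
    tolerance: float = 0.0,
) -> List[str]:
    """Return the ids of representations that end before the period ends.

    rep_durations maps a (hypothetical) representation id to the total
    duration of its segments in seconds. A representation counts as
    "short" when its media ends more than `tolerance` seconds before
    the end of the period.
    """
    return sorted(
        rep_id
        for rep_id, duration in rep_durations.items()
        if duration + tolerance < period_duration
    )


short = find_short_representations(
    600.0, {"video": 600.0, "audio": 600.2, "subs": 540.0}
)
print(short)
```

A player that cannot rule out short representations would need a check like this on every seek, which is the extra code path the comment above refers to.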
I would say this also falls into the "not hugely impractical, but yet another special case" bucket. So the question I'd pose is: Is this feature absolutely required for a particular use case and/or does it add a really significant benefit over an existing alternative? If the answer is yes then it seems reasonable. If not then I'd much rather see packagers not do this kind of thing. |
We should discourage doing it. If it is done, we should encourage adding presentationDuration. The client behaviour in this case should also be documented.
I would propose something like this as a starting point:
Edit: updated proposal below |
I'd be interested to know what the use case for the last paragraph in the proposal is. I think, from a complexity point of view, it's important to avoid having multiple ways of doing the same thing unless there are strong justifications as to why both are needed. It feels like the serving side should always be able to generate padding segments, and that the proposal makes this the recommended approach. So why is it beneficial to also leave the possibility of not doing so open (with |
The use case that I heard was that one might want to e.g. terminate audio before video intentionally to ensure that it stops at a convenient transition point (e.g. moment of silence) when inserting ads. I can sort of see the use case as having relevance given that you are not always encoding/segmenting and packaging at the same time. You might have sets of already encoded and segmented content that you assemble into multiple periods. In such a case, you cannot just dive into the media stream itself to manipulate it - it is already a done deal. Of course, this would not be a very mainstream feature, hence why it would be discouraged. I do not expect (m)any implementations to really do this. Possibly in this case it might be more sensible to just say "not supported" to discourage even more strongly - I would say this feature is more likely to be misused than it is to be used in the way described above. A lot of what goes into IOP is, in my view, similar to this - a feature that is not going to be commonly used but if it is used, should at least be done in an interoperable way. Perhaps this has some tie-in with regard to how we should handle the key words in v5? #175 |
That use case sounds reasonable, but it doesn't sound like there would ever be insufficient content to provide segments up to the end of the period in that case. Wouldn't it be true that there exists content at least up to the end of the period (and probably beyond)? In which case it would IMO be preferable to still require that segments are provided up to the end of the period, even if
You make a fair point. I wonder if this might conflict with how
Upon further meditation I realize that a far more appropriate mechanism for signaling such editorial decisions as "drop some seconds of audio" are the custom descriptors that DASH defines. There is no need to modify segment addressing just to mute audio. The descriptors would also allow for a far wider range of flexibility such as a gradual fade-out. As this was the only use case I have run into for dropping segments, I think I can now back the viewpoint that no segments should ever be missing with a clear conscience.
Accordingly, I submit an updated proposal:
Segments shall be provided for all Representations in a Period up until the end of the Period. If necessary, padding segments containing empty/blank/silent samples shall be supplied to ensure there is no gap near the end of a Period. The last segment may extend beyond the Period end point. Clients shall ignore any samples that exceed the bounds of the Period.
For editorial manipulation, custom descriptors can be proposed by who needs them. |
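To make the two halves of the proposal concrete, here is a small Python sketch under stated assumptions: all times are integer ticks of a shared timescale, padding segments have a fixed duration, and a sample is "ignored" when it starts at or after the period end. How to treat a sample that straddles the period end is not settled by the proposal text, so this sketch simply keeps it. Both function names are hypothetical.

```python
from typing import List, Tuple


def padding_segments_needed(
    period_duration: int, media_duration: int, segment_duration: int
) -> int:
    """Packager side: number of padding segments needed to reach the period end.

    All values are integer ticks. The last padding segment may extend
    beyond the period end, so the gap is rounded up to whole segments.
    """
    gap = period_duration - media_duration
    if gap <= 0:
        return 0
    return -(-gap // segment_duration)  # integer ceiling division


def samples_within_period(
    samples: List[Tuple[int, int]], period_duration: int
) -> List[Tuple[int, int]]:
    """Client side: drop (start, duration) samples starting at or after the period end."""
    return [(start, dur) for start, dur in samples if start < period_duration]


# 600 s period, 540 s of subtitles, 4 s segments, timescale of 1000 ticks/s.
print(padding_segments_needed(600_000, 540_000, 4_000))
print(samples_within_period([(0, 2), (2, 2), (4, 2)], 4))
```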
That proposal sounds good to me. |
This came up in yesterday's call again, the point raised being that there exists a lot of on-demand content that does not conform to this. I agree that such content exists. I claim that such content should not be classified as interoperable content and should be considered outside DASH-IF IOP profiles. To make on-demand content with "short" representations interoperable, the following possibilities exist:
The idea that period timing is only "rough" and should not be relied upon for exact timing was also expressed in the call. I do not agree with this interpretation and expect period timing to always be accurate - the DASH timing model would be quite badly affected if period timing could not be relied upon. By accurate I mean:
Missing segments are a related but separate topic that I think is not important here. We should expect this for both live and on-demand profile. |
@sandersaares I agree pretty thoroughly with your reasoning here. Having streamed representations that are not aligned throughout the period makes the timing model a lot harder, and the player side fix of blank frames or silent audio samples is extremely dependent on the underlying device platform capabilities. For a good number of mass market devices the underlying decoding pipelines do not handle these empty segments well and result in pipeline failures, not mandating the usage of filler segments should make implementing an interoperable player far easier.
From our (Hulu) internal player work, the timing model works best when the period timing is accurately described by either explicit
With the periods defining the overall timeline it is possible for the video and audio elementary streams to be sparse, but again you have to cope with underlying pipelines not handling this sparseness well, so it is best to avoid sparseness if possible. For text and event streams that do not rely on underlying platform pipelines (at least in our experience), the sparseness is easily handled, as the text and event streams have explicit timing for their elements and do not require the generation of filler data.
The one point you make that I would describe as hard to follow as a player is:
This relies on a lot of control over the underlying media pipelines which you cannot always achieve, MSE based players using encrypted content would have trouble doing this for seamless transitions for instance. That said I believe the |
👍 I like the direction you guys have taken on this topic. Indeed it makes much more sense to throw away some left over media than to play media that isn't there. I have a small question though on how non-compliant contents would be treated from IOP perspective.
In the typical case of a (non-sparse) subtitle track that ends well before the A/V credits, would it be fair to say that such a presentation ends instantly when the shortest (i.e. subs) track ends? That is, in order to comply, the presentation could be altered to:
I would rather say that such content is non-interoperable and consistent behavior cannot be expected across a wide range of client systems. I would not say that the presentation ends there because you can only define the end point when operating under a common understanding of the timing model, which such content violates.
That would result in following the interoperable timing model and would indeed be a good approach.
Also works, although I suspect less desirable.
I am not aware of such a concept as sparse tracks in DASH. A fourth option (and IMO the easiest) would be to start a new period at the point where the subtitle track ends, with the new period not having any subtitles. The other tracks could be designated as period-connected and (provided client system support for seamless playback) continue playback seamlessly while also properly terminating the subtitle track. |
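A minimal Python sketch of that fourth option follows, assuming periods are modelled as plain dictionaries and times are in seconds. The `split_period` function and the dictionary keys are hypothetical illustrations, not MPD syntax, and period-connectivity signalling is left out.

```python
from typing import Dict, List


def split_period(
    start: float,
    duration: float,
    rep_ids: List[str],
    short_rep_id: str,
    short_rep_duration: float,
) -> List[Dict]:
    """Split one period into two at the point where the short track ends.

    The first period carries every representation; the second carries
    all of them except the one that ended early.
    """
    first = {
        "start": start,
        "duration": short_rep_duration,
        "representations": list(rep_ids),
    }
    second = {
        "start": start + short_rep_duration,
        "duration": duration - short_rep_duration,
        "representations": [r for r in rep_ids if r != short_rep_id],
    }
    return [first, second]


for period in split_period(0.0, 600.0, ["video", "audio", "subs"], "subs", 540.0):
    print(period)
```

Every representation in each resulting period then runs to the end of that period, so neither period contains a partial track.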
Multiple periods is actually a sound solution I had not thought of. That makes sense! It obviously puts some strain on stitching implied or unintentional discontinuities introduced by the period edges, as the end of subtitle track may not coincide with start of new GOP and/or audio access point, but at least we would not need to worry about partial tracks inside those periods anymore. Should give packager and player development an incentive to focus on getting the period transitions right. |
This topic was used as feedback into the formulation of the interoperable timing model. As there has been no further discussion here for some time, I close it. |
For on-demand DASH, within a period of a specified duration d, is it allowed for some representations to have durations significantly shorter than d if they would otherwise end with "empty" segments? This question arises specifically for caption/subtitle representations where there are no captions/subtitles near the end of the content (e.g. during credits). At least one packager I'm aware of lets caption/subtitle representations have significantly shorter durations in this case, rather than padding up to duration d with empty segments.
Is it explicitly specified either way whether this kind of "short" representation is permitted? If not, would it be possible to add a requirement one way or another to the DASH IF guidelines?
From my point of view as a player developer, I would rather see a requirement that such representations are always padded up to d with empty segments because:
It's common to see large gaps in subtitles in the middle of streams as well as at the end, and for that case padding with empty segments is the solution. It seems inconsistent to treat the end of the stream any differently.
For live streams that haven't ended yet, but haven't had any subtitles for a while, there's no real alternative to padding with empty segments, else the player can't disambiguate between "no subtitles" and "subtitle segments are late being added to the manifest". If a live stream ends and turns into an on-demand stream, it will therefore end with empty segments. It seems nice for this and content that was only ever on-demand to be consistent.
It simplifies player implementations, because it avoids having to handle cases like seeking into part of the period where some (but not all) representations have ended.
Conversely (and for completeness), some benefits of explicitly allowing representations to end early are: